Abstract: An optimal control method based on Euler reinforcement learning (ERL) is proposed for a class of nonlinear uncertain dynamic systems. In this method, a reinforcement learning algorithm is employed to approximate the unknown nonlinear functions in the plant, and online learning rules for the reward function and the policy function are derived. The value function is estimated and the control policy is improved using temporal difference (TD) errors, which are discretized by the forward Euler approximation of the time derivative. Based on the value gradient and a TD-error performance index, the steps of the algorithm and an error estimation theorem are given. Simulation results for the mountain-car problem show the effectiveness of the presented method.
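
To illustrate the forward Euler discretization of the TD error mentioned above, here is a minimal sketch. It assumes the commonly used continuous-time TD error form delta(t) = r(t) - V(t)/tau + dV/dt, with the time derivative replaced by (V(t + dt) - V(t))/dt. This form, the time constant `tau`, and the function name `euler_td_error` are assumptions for illustration and are not taken from the abstract, so the exact expression used in the paper may differ.

```python
def euler_td_error(reward, v_now, v_next, dt, tau):
    """Forward-Euler TD error (illustrative form, assumed here).

    Approximates delta(t) = r(t) - V(t)/tau + dV/dt by replacing
    dV/dt with the forward difference (V(t + dt) - V(t)) / dt.
    """
    dv_dt = (v_next - v_now) / dt  # forward Euler estimate of dV/dt
    return reward - v_now / tau + dv_dt


# Hypothetical values: reward 1.0, V(t) = 2.0, V(t + dt) = 2.5
delta = euler_td_error(reward=1.0, v_now=2.0, v_next=2.5, dt=0.5, tau=4.0)
print(delta)
```

Under this assumed form, a value estimate that is constant in time and equal to `reward * tau` gives a zero TD error, which is a quick sanity check on the sign conventions.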