Abstract: This paper proposes a multi-step reinforcement learning control algorithm based on an RBF neural network for model-free control of nonlinear systems with continuous state spaces. First, the neural network is introduced into the reinforcement learning system to approximate the state-action value function, a common way of representing continuous state spaces in reinforcement learning. Then, the eligibility trace mechanism is incorporated to form the multi-step algorithm Sarsa($\lambda$), which improves learning efficiency by recording previously visited states. Finally, the softmax policy is improved with a decaying temperature parameter, which adjusts the action selection probabilities and balances exploration against exploitation. Simulation results on the MountainCar task show that the proposed algorithm accomplishes model-free control of a continuous nonlinear system with relatively few training episodes. Compared with the single-step algorithm, the multi-step algorithm needs fewer steps on average to complete the task after convergence and performs more stably, which shows that combining nonlinear value function approximation with a multi-step algorithm performs well on this control task.
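The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the RBF centers and the shared width are fixed so that only the output weights are learned, the traces are accumulating traces, and the temperature decays exponentially toward a floor `t_min`. All function and parameter names are hypothetical.

```python
import math


def rbf_features(state, centers, width):
    """Gaussian RBF activation of `state` for each center (assumed fixed centers)."""
    return [
        math.exp(-sum((s - c) ** 2 for s, c in zip(state, center)) / (2.0 * width ** 2))
        for center in centers
    ]


def q_values(weights, phi):
    """Q(s, a) for every action, using one output weight vector per action."""
    return [sum(w * f for w, f in zip(w_a, phi)) for w_a in weights]


def softmax_probs(q, temperature):
    """Boltzmann action probabilities; max(q) is subtracted for numerical stability."""
    m = max(q)
    exps = [math.exp((v - m) / temperature) for v in q]
    total = sum(exps)
    return [e / total for e in exps]


def decayed_temperature(t0, decay, episode, t_min=0.01):
    """Assumed schedule: T = max(t_min, t0 * decay**episode)."""
    return max(t_min, t0 * decay ** episode)


def sarsa_lambda_step(weights, traces, phi, action, reward, phi_next, next_action,
                      alpha, gamma, lam, done):
    """One in-place Sarsa(lambda) update with accumulating eligibility traces."""
    q_sa = sum(w * f for w, f in zip(weights[action], phi))
    q_next = 0.0 if done else sum(w * f for w, f in zip(weights[next_action], phi_next))
    delta = reward + gamma * q_next - q_sa  # temporal-difference error
    for a in range(len(weights)):
        for i in range(len(phi)):
            traces[a][i] *= gamma * lam      # decay every trace
            if a == action:
                traces[a][i] += phi[i]       # gradient of Q(s, a) w.r.t. its weights
            weights[a][i] += alpha * delta * traces[a][i]
    return delta
```

In this sketch a high temperature early in training keeps the action probabilities close to uniform (exploration), and as the temperature decays the policy concentrates on the action with the highest estimated value (exploitation). Setting `lam` to 0 reduces the update to single-step Sarsa.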