Reinforcement learning based fractional gradient descent RBF neural network control of inverted pendulum

Affiliation:

(Institute of Navigation, Jimei University, Xiamen 361021, China)

Corresponding author:

E-mail: imlmd@163.com.

CLC number:

TP18

Fund project:

National Natural Science Foundation of China (51579114); Natural Science Foundation of Fujian Province (2018J05085).


Abstract:

In order to improve the control performance of reinforcement learning, a reinforcement learning algorithm based on a fractional-order gradient descent RBF neural network is proposed. The reinforcement learning system is composed of a critic (evaluation) neural network and an actor (action) neural network; using the memory and association capabilities of the neural networks, the system learns to control the inverted pendulum, improves the control accuracy and drives the error toward zero until learning succeeds, and the stability of the closed-loop system is proved. Physical experiments on the inverted pendulum show that when the fractional order is large, the differential effect is more significant and the angular velocity and velocity are controlled better, so their mean square errors and mean absolute errors are smaller; when the fractional order is small, the integral effect is more significant and the tilt angle and displacement are controlled better, so their mean square errors and mean absolute errors are smaller. Simulation results show that the proposed algorithm has good dynamic response, small overshoot, short settling time, high precision and good generalization performance. It outperforms the reinforcement learning algorithm based on an ordinary RBF neural network and the traditional reinforcement learning algorithm, and it effectively accelerates the convergence of the gradient descent method and improves its control performance. After an appropriate disturbance is introduced, the proposed algorithm quickly self-adjusts and returns to the stable state, and the robustness and dynamic performance of the controller meet practical requirements.
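The abstract describes an actor-critic reinforcement learning controller whose RBF networks are trained by a fractional-order gradient descent rule, but the update law itself is not reproduced on this page. The sketch below therefore illustrates only one common Caputo-type truncation of such an update, applied to the output weights of a Gaussian RBF network: the cost function, the learning rate `eta`, the fractional order `alpha`, the reference point `w_ref` and all helper names are illustrative assumptions, not the authors' notation. When `alpha = 1` the rule reduces to ordinary gradient descent, since the correction factor |w - w_ref|^(1-alpha) / Γ(2-alpha) becomes 1.

```python
# Minimal sketch (not the paper's exact formulation) of a fractional-order
# gradient descent update for the output weights of a Gaussian RBF network.
import numpy as np
from math import gamma


def rbf_features(x, centers, sigma):
    """Gaussian RBF hidden-layer activations for an input vector x."""
    d2 = np.sum((centers - x) ** 2, axis=1)        # squared distance to each center
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fractional_gd_step(w, x, y_target, centers, sigma,
                       alpha=0.9, eta=0.1, w_ref=None, eps=1e-6):
    """One fractional-order gradient descent step on the RBF output weights.

    Uses the common truncation
        D^alpha E(w) ~ (dE/dw) * |w - w_ref|**(1 - alpha) / Gamma(2 - alpha),
    which reduces to ordinary gradient descent when alpha = 1.
    """
    if w_ref is None:
        w_ref = np.zeros_like(w)                   # lower terminal of the fractional derivative (assumed)
    phi = rbf_features(x, centers, sigma)          # hidden-layer output
    y = w @ phi                                    # network output
    grad = (y - y_target) * phi                    # dE/dw for E = 0.5 * (y - y_target)**2
    frac = np.abs(w - w_ref) ** (1.0 - alpha) + eps
    return w - eta * grad * frac / gamma(2.0 - alpha)


# Toy usage: drive the network output toward a constant target.
rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, size=(10, 4))     # 10 RBF centers for a 4-d state
w = rng.normal(0.0, 0.1, size=10)                  # small nonzero initial weights
x_state = np.array([0.1, 0.0, -0.05, 0.0])         # e.g. [angle, angular velocity, position, velocity]
for _ in range(300):
    w = fractional_gd_step(w, x_state, y_target=1.0, centers=centers, sigma=1.0)
print(w @ rbf_features(x_state, centers, 1.0))     # close to 1.0
```

In the actor-critic setting outlined in the abstract, an update of this kind would presumably be applied to both the evaluation (critic) and action (actor) networks, with the reinforcement signal taking the place of the supervised error used in this toy example.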

Cite this article

薛晗, 邵哲平, 方琼林, et al. Reinforcement learning based fractional gradient descent RBF neural network control of inverted pendulum[J]. Control and Decision, 2021, 36(1): 125-134.

History
  • Online publication date: 2021-01-06
  • Publication date: 2021-01-20