Performance optimization for tracking control of fixed-wing UAV with incremental Q-learning

Authors:

Zhao Zhengen, Cheng Lei

Affiliation:

College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Corresponding author:

E-mail: zhaozhengen@nuaa.edu.cn

CLC number:

TP273

Fund Project:

National Natural Science Foundation of China (62003161); Natural Science Foundation of Jiangsu Province (BK20190399); China Postdoctoral Science Foundation (2021M701701).

    Abstract:

    To meet the high-performance requirements of longitudinal control for fixed-wing unmanned aerial vehicles (UAVs), a performance optimization structure for the control system is proposed. The structure consists of a nominal controller that stabilizes the system and an incremental controller that optimizes performance. The incremental implementation leaves the original control system unchanged; it only compensates the control input and optimizes the control performance of the nominal control system. The incremental controller is designed using Q-learning theory. For systems whose state information is fully available, an incremental Q-learning algorithm based on state feedback is developed. When the state information is not fully available, an incremental Q-learning algorithm based on output feedback is designed using system input, output, and reference signal data. Both incremental controllers learn the incremental control law adaptively in a data-driven setting, requiring neither the system dynamics model nor the control gain of the nominal controller in advance. In addition, it is proved that, under excitation noise satisfying the persistent excitation condition, the incremental Q-learning method solves the Q-function Bellman equation without bias. Finally, simulation on a longitudinal model of the F-16 aircraft verifies the effectiveness of the method.
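The state-feedback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a regulation problem rather than tracking, a toy discretized double integrator rather than the F-16 model, a quadratic cost on the state and the incremental input, and a batch least-squares policy iteration as the Q-learning scheme. All names (`collect`, `evaluate`, `incremental_q_learning`, `riccati_gain`) and all numerical values are hypothetical. The learner only uses the data triples (x, du, x_next); `A`, `B`, and `K0` appear only inside the simulated plant and in the model-based reference used for checking.

```python
import numpy as np

# Hypothetical toy plant (assumption): discretized double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
K0 = np.array([[1.0, 1.5]])   # nominal stabilizing gain, unknown to the learner
Qx = np.eye(2)                # assumed state weight
R = np.array([[1.0]])         # assumed weight on the incremental input du

def collect(A, B, K0, L, rng, steps=200, noise=0.5):
    """Simulate the plant with u = -K0 x + du and record (x, du, x_next)."""
    x = rng.standard_normal(A.shape[0])
    data = []
    for _ in range(steps):
        # Incremental control: current policy plus exploration noise.
        du = -L @ x + noise * rng.standard_normal(B.shape[1])
        x_next = A @ x + B @ (-K0 @ x + du)
        data.append((x, du, x_next))
        x = x_next
    return data

def evaluate(data, L, Qx, R):
    """Least-squares solve of the Q-function Bellman equation for policy du = -L x.

    Q(x, du) = z' H z with z = [x; du]; each sample gives
    z' H z - z_next' H z_next = x' Qx x + du' R du, with z_next = [x_next; -L x_next].
    """
    rows, costs = [], []
    for x, du, x_next in data:
        z = np.concatenate([x, du])
        z_next = np.concatenate([x_next, -L @ x_next])
        rows.append(np.kron(z, z) - np.kron(z_next, z_next))
        costs.append(x @ Qx @ x + du @ R @ du)
    # Minimum-norm least-squares solution, then symmetrize.
    h, *_ = np.linalg.lstsq(np.array(rows), np.array(costs), rcond=None)
    size = len(data[0][0]) + len(data[0][1])
    H = h.reshape(size, size)
    return 0.5 * (H + H.T)

def incremental_q_learning(A, B, K0, Qx, R, iters=10, seed=0):
    """Policy iteration on the incremental gain L, starting from L = 0."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    L = np.zeros((m, n))  # L = 0 is admissible because K0 already stabilizes
    for _ in range(iters):
        data = collect(A, B, K0, L, rng)
        H = evaluate(data, L, Qx, R)
        # Policy improvement: du = -inv(H_uu) H_ux x.
        L = np.linalg.solve(H[n:, n:], H[n:, :n])
    return L

def riccati_gain(Ac, B, Qx, R, iters=5000):
    """Model-based reference gain from Riccati value iteration (checking only)."""
    P = Qx.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ Ac)
        P = Qx + Ac.T @ P @ (Ac - B @ K)
    return K

if __name__ == "__main__":
    L = incremental_q_learning(A, B, K0, Qx, R)
    print("learned incremental gain:", L)
    print("model-based reference:  ", riccati_gain(A - B @ K0, B, Qx, R))
```

Because the Bellman equation in `evaluate` holds for any incremental input applied at the current step, the exploration noise enters the regression only through the recorded `du`. In this noise-free-plant sketch it therefore does not bias the least-squares solution, which mirrors, in a much simpler setting, the unbiasedness property stated in the abstract.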

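Cite this article

Zhao Zhengen, Cheng Lei. Performance optimization for tracking control of fixed-wing UAV with incremental Q-learning[J]. Control and Decision, 2024, 39(2): 391-400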

History
  • Online publication date: 2024-01-18
  • Publication date: 2024-02-20