A novel modified search and rescue optimization algorithm based on reinforcement learning for UAV path planning

Affiliation:

1. College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China; 2. Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Nanning 530006, China

Corresponding author:

E-mail: 25713893@qq.com

CLC number:

TP301.6

Fund Project:

National Natural Science Foundation of China (62062011); Guangxi Minzu University Graduate Innovation Program (gxun-chxs2021057).

    Abstract:

    The search and rescue optimization algorithm (SAR), proposed in 2020, is a meta-heuristic optimization algorithm that simulates search and rescue behavior and is used to solve constrained engineering optimization problems. However, SAR converges slowly and its individuals cannot adaptively select operations. To address this, a modified version of SAR based on reinforcement learning, namely RLSAR, is proposed. It redesigns the local search and global search operations of SAR and adds a path adjustment operation. The asynchronous advantage actor-critic algorithm (A3C) is used to train the reinforcement learning model so that SAR individuals acquire the ability to adaptively select operators. All agents are trained in a dynamic environment in which the number, locations and sizes of threat areas are randomly generated. Exploratory experiments are then conducted on the trained model from three aspects: the contribution of each action, the path length planned under different threat areas, and the sequence of operations executed by each individual. The results show that RLSAR converges faster than the standard SAR, the differential evolution algorithm and the squirrel search algorithm. Furthermore, it can successfully plan more economical, safe and effective feasible paths for an unmanned aerial vehicle (UAV) in randomly generated three-dimensional dynamic environments, which shows that the proposed algorithm can serve as an effective path planning method for UAVs.

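Cite this article

Zhou Wenjuan, Zhang Chaoqun, Tang Weidong, et al. A novel modified search and rescue optimization algorithm based on reinforcement learning for UAV path planning[J]. Control and Decision, 2024, 39(4): 1203-1211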

History
  • Published online: 2024-03-15
  • Publication date: 2024-04-20