Navigation method for mobile robot based on hierarchical deep reinforcement learning

Affiliation:

School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China

Corresponding author:

E-mail: aoli@ustc.edu.cn

CLC number:

TP242

Fund projects:

Outstanding Introduced Talents Fund of the University of Science and Technology of China (KY2100000021); National Natural Science Foundation of China (61971393, 61871361).

Abstract:

Existing hierarchical navigation methods based on deep reinforcement learning (DRL) perform poorly in complex environments that contain structures such as long corridors and dead corners. To address this problem, we propose a navigation method for mobile robots based on option-based hierarchical deep reinforcement learning (HDRL). The framework of the proposed method has two levels. At the low level, an obstacle avoidance control model and a goal-driven control model implement the obstacle avoidance and goal-approaching behavior policies, respectively. At the high level, a behavior selection model automatically learns a stable and reliable behavior selection policy, which removes the dependence on manually designed control rules. In addition, the obstacle avoidance control model is trained with an optimized procedure, so that the learned obstacle avoidance policy is better suited to navigation tasks in complex environments. In comparison experiments with existing DRL-based navigation methods, the proposed method achieves the highest navigation success rate in all simulated test environments used in this paper and shows an overall advantage on the other metrics. These results indicate that the proposed method effectively addresses poor navigation performance in complex environments and generalizes well. Tests in a real-world environment further verify the potential application value of the proposed method.
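The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Observation`, `avoid_obstacles`, `approach_goal`, `toy_q_values`, `hierarchical_step`) is hypothetical, the sensor layout and velocity limits are assumptions, and the hand-written functions only stand in for the models that the paper trains with DRL. In particular, `toy_q_values` is a fixed scoring rule used so that the control flow runs, whereas the paper's high-level behavior selection policy is learned rather than hand-designed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# (linear velocity, angular velocity); positive angular velocity turns left.
Action = Tuple[float, float]


@dataclass
class Observation:
    ranges: Sequence[float]  # range readings ordered right to left (assumed layout)
    goal_distance: float     # distance to the goal in metres
    goal_bearing: float      # radians; positive means the goal is to the left


def avoid_obstacles(obs: Observation) -> Action:
    """Placeholder for the learned obstacle avoidance policy."""
    half = len(obs.ranges) // 2
    right = sum(obs.ranges[:half])
    left = sum(obs.ranges[half:])
    # Creep forward and turn toward the side with more free space.
    return (0.1, 0.8 if left > right else -0.8)


def approach_goal(obs: Observation) -> Action:
    """Placeholder for the learned goal-driven policy."""
    angular = max(-1.0, min(1.0, 1.5 * obs.goal_bearing))
    linear = min(0.5, obs.goal_distance)
    return (linear, angular)


# Low level: one policy per behavior (option 0 = avoid, option 1 = approach).
LOW_LEVEL_POLICIES: List[Callable[[Observation], Action]] = [
    avoid_obstacles,
    approach_goal,
]


def toy_q_values(obs: Observation) -> List[float]:
    """Stand-in for the learned high-level value function (one score per option)."""
    nearest = min(obs.ranges)
    return [1.0 - nearest, nearest - 0.5]


def hierarchical_step(
    obs: Observation,
    q_fn: Callable[[Observation], List[float]] = toy_q_values,
) -> Tuple[int, Action]:
    """High level picks the option with the largest score; low level acts."""
    q = q_fn(obs)
    option = max(range(len(q)), key=lambda i: q[i])
    return option, LOW_LEVEL_POLICIES[option](obs)


if __name__ == "__main__":
    blocked = Observation(ranges=[0.3, 0.4, 1.5, 2.0], goal_distance=3.0, goal_bearing=0.2)
    clear = Observation(ranges=[2.0, 2.5, 2.5, 2.0], goal_distance=3.0, goal_bearing=0.2)
    print(hierarchical_step(blocked))
    print(hierarchical_step(clear))
```

Passing the scoring function as `q_fn` keeps the dispatch logic separate from the selector, so a trained network could be substituted for the toy rule without changing `hierarchical_step`.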

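Cite this article

Wang Tong, Li Ao, Song Hailuo, et al. Navigation method for mobile robot based on hierarchical deep reinforcement learning[J]. Control and Decision, 2022, 37(11): 2799-2807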

History
  • Published online: 2022-09-30
  • Publication date: 2022-11-20