Mapless navigation based on deep reinforcement learning for mobile robots
Affiliation:

1. School of Information Engineering, Dalian Ocean University, Dalian 116023, China; 2. Key Laboratory of Facility Fisheries, Ministry of Education, Dalian Ocean University, Dalian 116023, China; 3. Liaoning Provincial Key Laboratory of Marine Information Technology, Dalian 116023, China; 4. College of Mechanical and Electronic Engineering, Dalian Minzu University, Dalian 116023, China

Corresponding author:

E-mail: linyuanshan@dlou.edu.cn

CLC number:

TP242

Funding:

National Natural Science Foundation of China (61603067); Natural Science Foundation of Liaoning Province (2020-KF-12-09); Dalian High-Level Talent Innovation Support Program (2017RQ053); Key Research and Development Program of Liaoning Province (2020JH2/10100043); Liaoning Provincial Department of Education Fund (LJKZ0730, QL202016, JL202015).

    Abstract:

    Traditional navigation methods depend on map accuracy and adapt poorly to dynamic, complex scenes. To address this, a mapless autonomous navigation algorithm based on deep reinforcement learning and curriculum learning is proposed. To overcome the difficulty an agent has in learning under sparse rewards, a deep reinforcement learning training method guided by a circle-of-competence curriculum is proposed, drawing on the idea of curriculum learning. In addition, to make better use of the robot's current collision information when choosing actions, the concept of collision probability is introduced: the obstacle information currently perceived by the robot is represented in a high-level semantic form and encoded into the robot's current observation as part of the input to the navigation policy. This simplifies the mapping from observation to action and further reduces the difficulty of learning. Experimental results show that the proposed curriculum-guided training and collision probability markedly speed up the convergence of the navigation policy. The learned policy achieves a success rate above 90% in scenes with larger space and reduces travel time by 53.5% to 73.1%, and it can provide reliable navigation for unmanned operations in unstructured, unknown environments.
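
The abstract does not say how the circle-of-competence curriculum or the collision probability is computed. The sketch below is only one plausible reading of the two ideas, not the authors' method. Goals are sampled inside a circle around the robot whose radius grows once the recent success rate is high enough, and each range reading is mapped to a value in [0, 1] that could be appended to the observation. Every name and parameter in it (`CompetenceCurriculum`, `promote_at`, `safe_dist`, `horizon`, the linear mapping) is a hypothetical choice made for illustration.

```python
import math
import random
from collections import deque


class CompetenceCurriculum:
    """Hypothetical curriculum: the goal-sampling circle grows with success."""

    def __init__(self, start_radius=1.0, max_radius=10.0, step=0.5,
                 window=50, promote_at=0.8):
        self.radius = start_radius          # current "circle of competence"
        self.max_radius = max_radius
        self.step = step                    # radius increase per promotion
        self.promote_at = promote_at        # success rate needed to promote
        self.results = deque(maxlen=window) # most recent episode outcomes

    def sample_goal(self, robot_xy, rng=random):
        """Sample a goal uniformly inside the current circle around the robot."""
        r = self.radius * math.sqrt(rng.random())  # sqrt gives uniform area density
        theta = rng.uniform(0.0, 2.0 * math.pi)
        return (robot_xy[0] + r * math.cos(theta),
                robot_xy[1] + r * math.sin(theta))

    def report(self, success):
        """Record one episode outcome; widen the circle when the full
        window reaches the promotion threshold, then start a new window."""
        self.results.append(bool(success))
        full = len(self.results) == self.results.maxlen
        if full and sum(self.results) / len(self.results) >= self.promote_at:
            self.radius = min(self.radius + self.step, self.max_radius)
            self.results.clear()
        return self.radius


def collision_probability(ranges, safe_dist=0.3, horizon=2.0):
    """Assumed linear mapping from range readings (m) to [0, 1]:
    1 at or inside safe_dist, 0 at or beyond horizon."""
    probs = []
    for d in ranges:
        p = (horizon - d) / (horizon - safe_dist)
        probs.append(min(1.0, max(0.0, p)))
    return probs
```

In a training loop under these assumptions, `sample_goal` would be called at every episode reset, `report` at every episode end, and the output of `collision_probability` concatenated with the remaining observation before it is passed to the policy network.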

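Cite this article

Hu Gaoming, Cai Kewei, Wang Fang, et al. Mapless navigation based on deep reinforcement learning for mobile robots[J]. Control and Decision, 2024, 39(3): 985-993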

History
  • Published online: 2024-02-25
  • Publication date: 2024-03-20