A Robot Autonomous Exploration Method Combining Graph Neural Network and Deep Reinforcement Learning
DOI:
Author:
Affiliation:

Beijing University of Technology

CLC number:

TP273

Fund project:

National Key R&D Program of China



    Abstract:

    Autonomous exploration of mobile robots in unknown environments is a key problem in robotics research. Existing exploration strategies suffer from large localization uncertainty, long decision times, slow exploration rates, and poor robustness. Combining graph neural networks with deep reinforcement learning can effectively mitigate these problems. Based on this architecture, this paper proposes a new autonomous exploration method for robots. The method introduces the concept of virtual landmarks to describe the environment map. First, an EM (Expectation Maximization) exploration strategy updates the landmark existence probabilities represented by the virtual landmarks and their uncertainty estimates. Next, an exploration graph is constructed from information about the robot and the environment; using this graph as the data structure representing the environment improves the exploration strategy's robustness to map size. The graph is then fed into a gated graph neural network, which mines hidden relationships in the data to help update the graph node features. Finally, a Double Deep Q-learning Network (DDQN) is built on top of the gated graph neural network, reducing the effect of noise on action selection and improving the performance of the exploration strategy. We conducted experiments in simulated environments and compared the method with other autonomous exploration approaches. The experiments show that the proposed strategy has short decision times and good robustness to changes in map size, and that it achieves higher map accuracy while increasing the exploration rate.
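The gated graph neural network step described above can be pictured as a GRU-style update of each node's hidden state from the sum of its neighbors' states. The following is a minimal sketch of that propagation rule, not the paper's implementation; the graph size, feature dimension, and parameter names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def ggnn_step(H, A, Wz, Uz, Wr, Ur, W, U):
    """One gated-graph-network propagation step: each node aggregates
    its neighbors' hidden states, then a GRU-like gate mixes the
    aggregate into the node's own hidden state."""
    M = A @ H                       # messages: sum of neighbor states
    Z = sigmoid(M @ Wz + H @ Uz)    # update gate
    R = sigmoid(M @ Wr + H @ Ur)    # reset gate
    Htil = np.tanh(M @ W + (R * H) @ U)
    return (1 - Z) * H + Z * Htil   # gated state update

n, d = 4, 8                         # toy: 4 graph nodes, 8-dim features
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], float) # illustrative exploration-graph adjacency
H = rng.standard_normal((n, d))     # initial node features
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
H1 = ggnn_step(H, A, *params)
print(H1.shape)                     # (4, 8): one updated state per node
```

Stacking several such steps lets information from multi-hop neighbors reach each node, which is what allows the network to "mine hidden relationships" across the exploration graph.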

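The DDQN component reduces noise-driven overestimation by splitting action selection from action evaluation: the online network picks the greedy action, while the target network scores it. A generic sketch of that target computation (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it, which curbs the overestimation
    bias of vanilla Q-learning under noisy value estimates."""
    if done:
        return reward
    a_star = int(np.argmax(next_q_online))         # selection (online net)
    return reward + gamma * next_q_target[a_star]  # evaluation (target net)

# toy example: the online net prefers action 1, which the target net scores 2.0
q_online = np.array([1.0, 3.0, 0.5])
q_target = np.array([2.5, 2.0, 4.0])
y = ddqn_target(reward=1.0, next_q_online=q_online, next_q_target=q_target)
print(round(y, 2))  # 1 + 0.99 * 2.0 = 2.98
```

Note that a vanilla DQN target would instead use `max(q_target)` = 4.0 here; decoupling selection from evaluation is exactly what keeps one noisy high estimate from dominating the update.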

History
  • Received: 2021-11-05
  • Revised: 2022-04-01
  • Accepted: 2022-04-08
  • Published online:
  • Published: