Densely occluded grasp detection based on RGB-D fusion
Author:
Affiliation:

China University of Mining and Technology

CLC Number:

TP911.73: TP391.4

Fund Project:

The National Natural Science Foundation of China (51904297, 61901003)

    Abstract:

    Current grasp detection models perform poorly on densely occluded objects and require labor-intensive manual data annotation. To address both problems, an improved scheme based on RGB-D image fusion is proposed, in which object detection and grasp detection are carried out as separate steps. The scheme allows a grasp detection model trained on single-object images to be applied directly to densely occluded multi-object scenes. First, because the objects to be grasped in densely occluded scenes vary in scale, a Sub-stage and Path Aggregation (SPA) multi-scale feature fusion module is proposed to enrich the high-dimensional semantic features of SPA-YOLO-Fusion, an object detection model with RGB-D feature-level fusion, so that the detector can locate every object to be grasped. Second, the GR-ConvNet grasp detection model, which uses RGB-D pixel-level fusion, estimates the grasp point of each detected object, and a background padding image preprocessing algorithm is proposed to reduce the mutual interference of densely occluded objects. Finally, a robotic arm grasps at the target points. On the LineMOD dataset, the mAP of SPA-YOLO-Fusion is 10% and 7% higher than that of YOLOv3-tiny and YOLOv4-tiny, respectively. On YODO_Grasp, a grasp detection dataset built from images collected in real scenes, GR-ConvNet with the background padding preprocessing achieves a grasp detection accuracy 23% higher than that of the original model.
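
    The background padding and pixel-level fusion steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the fill value is chosen or how depth is normalized, so the median fill, the min-max depth scaling, and the function names `pad_background` and `fuse_rgbd` are all assumptions made for illustration.

```python
import numpy as np


def pad_background(rgb, depth, box):
    """Keep the pixels inside one detected box and overwrite the rest.

    rgb:   H x W x 3 uint8 image
    depth: H x W depth map
    box:   (x1, y1, x2, y2) in pixels, x2 and y2 exclusive

    Assumption: the fill value is the image-wide median, used here as a
    simple stand-in for "background". The paper may use a different rule.
    """
    h, w = depth.shape
    x1, y1, x2, y2 = box
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(w, x2), min(h, y2)

    # True inside the detection box, False everywhere else.
    mask = np.zeros((h, w), dtype=bool)
    mask[y1:y2, x1:x2] = True

    rgb_bg = np.median(rgb.reshape(-1, 3), axis=0).astype(rgb.dtype)
    depth_bg = np.median(depth).astype(depth.dtype)

    rgb_out = np.where(mask[..., None], rgb, rgb_bg)
    depth_out = np.where(mask, depth, depth_bg)
    return rgb_out, depth_out


def fuse_rgbd(rgb, depth):
    """Pixel-level fusion: stack RGB and depth into one H x W x 4 array.

    Assumption: RGB is scaled to [0, 1] and depth is min-max normalized.
    """
    rgb_f = rgb.astype(np.float32) / 255.0
    d = depth.astype(np.float32)
    span = d.max() - d.min()
    d_norm = (d - d.min()) / span if span > 0 else np.zeros_like(d)
    return np.concatenate([rgb_f, d_norm[..., None]], axis=-1)
```

    In this sketch, each box returned by the object detector would be passed to `pad_background` in turn, and the fused four-channel result would then be given to a grasp detector trained on single-object images.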

History
  • Received: 2022-02-17
  • Last revised: 2022-11-13
  • Accepted: 2022-05-17
  • Published online: 2022-06-13
  • Publication date: