Fusion cost volume and cost aggregation method for deep stereo matching network

CLC number: TP391

Fund project: National Natural Science Foundation of China (52374165).

    Abstract:

    Deep stereo matching networks use a cost volume to encode 3D scene structure as correspondences between binocular features, with promising applications in robot localization and obstacle avoidance. However, existing cost volume methods fail to establish correlations between binocular features that are both comprehensive and non-redundant, which limits disparity prediction accuracy. To address this issue, this work introduces epipolar geometric constraints into cost volume computation for the first time and, by combining the complementary strengths of multiple cost volume types, proposes a plug-and-play fusion cost volume together with a cost aggregation method. First, the fusion cost volume simultaneously computes global dot-product correlations and local grouped dot-product correlations of feature vectors within the epipolar co-projection region, which keeps the feature correlations comprehensive while avoiding redundant information. Second, for the aggregation of neighboring information, traditional aggregation methods are combined with the characteristics of the fusion cost volume to propose an adaptive weighted dimensionality reduction method based on depthwise separable convolution, which addresses the dimensional imbalance and computational cost of the fusion cost volume in the aggregation stage. The proposed method is integrated into a stereo matching framework, named FusionStereo, and validated on benchmark datasets. The results show that FusionStereo achieves a BAD3 mismatch rate of 1.55% after in-domain training on KITTI 2015 and a BAD1 mismatch rate of 17.1% in cross-domain evaluation on Middlebury, clearly outperforming the compared methods based on other types of cost volume.
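
    The idea of pairing a global dot-product correlation with grouped dot-product correlations can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define the "epipolar co-projection region" or the fusion rule, so the sketch assumes a rectified image pair (epipolar lines coincide with image rows), treats the columns where both views overlap as the valid region, and fuses by simple channel concatenation. The function name `fusion_cost_volume` and its parameters are hypothetical.

```python
import numpy as np


def fusion_cost_volume(left, right, max_disp, num_groups):
    """Illustrative fused correlation volume (assumed layout, not the paper's).

    left, right : (C, H, W) feature maps from a rectified stereo pair.
    Returns an array of shape (1 + num_groups, max_disp, H, W):
      channel 0   -> global correlation (mean product over all C channels)
      channels 1: -> group-wise correlation (mean product within each group)
    """
    C, H, W = left.shape
    assert right.shape == left.shape, "both views need the same feature shape"
    assert C % num_groups == 0, "channels must divide evenly into groups"
    assert 0 < max_disp <= W, "disparity range cannot exceed the image width"

    volume = np.zeros((1 + num_groups, max_disp, H, W), dtype=np.float32)
    for d in range(max_disp):
        # Rectified pair: left pixel (y, x) is compared with right pixel
        # (y, x - d) on the same row. Columns x < d have no counterpart
        # in the right view and are left at zero.
        prod = left[:, :, d:] * right[:, :, : W - d]          # (C, H, W - d)
        volume[0, d, :, d:] = prod.mean(axis=0)               # global term
        grouped = prod.reshape(num_groups, C // num_groups, H, W - d)
        volume[1:, d, :, d:] = grouped.mean(axis=1)           # grouped terms
    return volume


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 3, 16)).astype(np.float32)
    # Unit-normalise each pixel's feature vector so that a perfect match
    # gives the largest possible global correlation.
    feats /= np.linalg.norm(feats, axis=0, keepdims=True)
    shifted = np.roll(feats, -2, axis=2)   # synthetic right view, disparity 2
    vol = fusion_cost_volume(feats, shifted, max_disp=4, num_groups=4)
    print(vol.shape)
    print(vol[0].argmax(axis=0))           # per-pixel best disparity (global term)
```

    With equally sized groups, the global channel equals the average of the grouped channels, which makes the redundancy between the two correlation types easy to see. How the paper avoids that redundancy and how it reduces the fused channels during aggregation is not specified in the abstract, so neither is modelled here.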

Cite this article

邹正阳, 伍云霞, 徐倩. Fusion cost volume and cost aggregation method for deep stereo matching network [J]. Control and Decision, 2025, 40(10): 3145-3154

History
  • Received: 2024-12-18
  • Published online: 2025-09-09
  • Publication date: 2025-10-20