Multi-object tracking based on intra-frame relationship modeling and self-attention fusion mechanism

Affiliation:

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Corresponding author:

E-mail: wanghuanphd@njust.edu.cn

CLC number:

TP273

Fund project:

National Natural Science Foundation of China (61703209, 61773215)

    Abstract:

    Multi-object tracking is a key technique in video surveillance. Over the past decade, convolutional neural networks (CNNs) and especially graph neural networks (GNNs) have driven substantial progress in multi-object tracking. GNNs in particular show more stable tracking performance because they model the relationships between targets and trajectories. However, existing GNN-based methods mostly build a global relationship model between targets and trajectories only across two consecutive frames, neglecting the interactions between an object and the other objects around it within a frame. To address this issue, we propose a multi-object tracking method based on an intra-frame relationship modeling and self-attention fusion model (INAF-GNN). Within a frame, INAF-GNN builds a relational graph between each object and its neighboring objects to obtain local tracking features. Across frames, INAF-GNN builds a relational graph between objects and trajectories to obtain global tracking features. A feature fusion module based on a self-attention mechanism then integrates the local and global tracking features. Extensive experiments on the MOTChallenge pedestrian benchmark datasets show that, compared with several GNN-based multi-object tracking methods, the proposed method improves MOTA by 1.9% and IDF1 by 3.6%. Validation on the UA-DETRAC vehicle dataset further demonstrates the effectiveness and generalization capability of the proposed method.
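
The fusion of local and global tracking features described in the abstract can be illustrated with a minimal sketch. This is not the authors' INAF-GNN implementation: the abstract does not specify the module's architecture, so the code below assumes the simplest possible form, a parameter-free scaled dot-product self-attention over the two feature vectors of a single object, with the learned projections of a real attention layer omitted. The function name `fuse_local_global` and the toy feature values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_local_global(local_feat, global_feat):
    """Fuse one object's local and global feature vectors.

    Assumption: the two vectors are treated as a length-2 token
    sequence, and self-attention is applied without learned
    query/key/value projections (a real model would learn them).
    """
    tokens = [local_feat, global_feat]
    scale = math.sqrt(len(local_feat))
    attended = []
    for query in tokens:
        # Attention weights of this token over both tokens.
        weights = softmax([dot(query, key) / scale for key in tokens])
        # Weighted sum of the tokens, one output per query.
        attended.append([
            sum(w * tok[d] for w, tok in zip(weights, tokens))
            for d in range(len(query))
        ])
    # Average the two attended tokens into a single fused vector.
    return [(a + b) / 2 for a, b in zip(*attended)]


# Toy example with made-up 3-dimensional features for one object.
fused = fuse_local_global([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
print(fused)
```

Because the attention weights are non-negative and sum to one, each component of the fused vector lies between the corresponding local and global components. In this simplified setting the attention only decides how strongly each source contributes.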

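Cite this article

Zhu Shushu, Wang Huan, Yan Hui. Multi-object tracking based on intra-frame relationship modeling and self-attention fusion mechanism[J]. Control and Decision (控制与决策), 2023, 38(2): 335-344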

History
  • Published online: 2023-01-29
  • Publication date: 2023-02-20