Human behavior recognition based on time domain extended residual network and dual branching structure

Affiliation:

1. College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China; 2. Engineering Research Center of Intelligence Perception and Autonomous Control of Ministry of Education, Beijing 100124, China

Corresponding author:

E-mail: lihui@qust.edu.cn

CLC number:

TP391

Fund Project:

Open Fund of the Engineering Research Center of Intelligence Perception and Autonomous Control of Ministry of Education (K100052021006); National Natural Science Foundation of China (61702295); Outstanding Youth Innovation Team Program of Shandong Higher Education Institutions (2019KJN047).

    Abstract:

    Graph convolutional networks have attracted considerable attention in behavior recognition because they can process the topological graph of skeleton joints directly and perform well. However, such methods often model long-term dependencies weakly and overlook the imbalance between spatial semantics and temporal event changes. To address these problems, a human behavior recognition method based on a time-domain dilated residual network and a dual-branch structure is proposed. For spatiotemporal feature extraction, graph convolution extracts spatial-domain features, while dilated causal convolutions and residual connections form the time-domain dilated residual network that extracts time-domain features. This network enlarges the temporal receptive field without substantially increasing the number of parameters, and thus better captures the long-term dependencies of human joint information in the time domain. In addition, a dual-branch structure is constructed: the low-frame-rate branch uses fewer frames and more channels to extract rich spatial semantic information, while the high-frame-rate branch uses more frames and fewer channels to capture rapid changes in human behavior while keeping the network lightweight. Experimental results show that the proposed method achieves higher accuracy on the NTU RGB+D dataset than current state-of-the-art behavior recognition methods.
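The two ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the kernel size of 3, the dilation schedule `[1, 2, 4, 8]`, the ReLU placement, and the subsampling stride are all assumptions made for illustration, and the graph convolution over joints is omitted (the input is a single scalar per frame). The sketch shows why dilation widens the temporal receptive field at a fixed parameter count, why the convolution is causal, and how a low-frame-rate branch could be obtained by temporal subsampling.

```python
from typing import List, Sequence, Tuple


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input frames that influence one output of the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_causal_conv1d(x: Sequence[float], w: Sequence[float],
                          dilation: int) -> List[float]:
    """Compute y[t] = sum_i w[i] * x[t - i * dilation].

    Frames before t = 0 are treated as zero, and only the current and past
    frames are read, so the filter is causal.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w_i in enumerate(w):
            src = t - i * dilation
            if src >= 0:  # implicit left zero-padding
                acc += w_i * x[src]
        y.append(acc)
    return y


def residual_block(x: Sequence[float], w: Sequence[float],
                   dilation: int) -> List[float]:
    """Dilated causal convolution, ReLU, then an identity skip connection."""
    conv = dilated_causal_conv1d(x, w, dilation)
    return [x_t + max(0.0, c_t) for x_t, c_t in zip(x, conv)]


def temporal_stack(x: Sequence[float], w: Sequence[float],
                   dilations: Sequence[int]) -> List[float]:
    """Apply one residual block per dilation (weights shared here for brevity)."""
    out = list(x)
    for d in dilations:
        out = residual_block(out, w, d)
    return out


def split_branches(frames: Sequence[float],
                   stride: int) -> Tuple[List[float], List[float]]:
    """Hypothetical branch inputs: every `stride`-th frame for the
    low-frame-rate branch, all frames for the high-frame-rate branch."""
    return list(frames[::stride]), list(frames)


if __name__ == "__main__":
    # Four layers with kernel size 3 have the same number of weights either way.
    print(receptive_field(3, [1, 2, 4, 8]))  # 31 frames with dilation
    print(receptive_field(3, [1, 1, 1, 1]))  # 9 frames without dilation
```

With the assumed settings, four dilated layers see 31 frames where four ordinary layers see 9, which is the sense in which the receptive field grows without a large increase in parameters.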

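Cite this article

XUE Panpan, LIU Yun, LI Hui, et al. Human behavior recognition based on time domain extended residual network and dual branching structure[J]. Control and Decision, 2022, 37(11): 2993-3002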

History
  • Online publication date: 2022-09-30
  • Publication date: 2022-11-20