Gaze estimation algorithm with dual-branch feature fusion

DOI: 10.13195/j.kzyjc.2024.0586
2025, Vol. 40, No. 4, pp. 1247-1256
CLC number: TP391.41


Abstract:

Gaze estimation predicts where a person is looking, either as a gaze position or as a gaze direction, and plays an important role in human-computer interaction and computer vision applications. To address the differences among features and their incomplete utilization, this paper proposes a gaze estimation algorithm based on dual-branch feature fusion. First, a dual-branch network that combines an Agent Swin Transformer network with a residual network is constructed to extract gaze features. The improved Agent Swin Transformer network forms the global feature extraction branch and extracts global semantic features layer by layer, while the residual network forms the local feature extraction branch and extracts local detail features at different scales. Feature fusion concatenates the feature tensors of the two branches to enhance the model's representational capacity. Second, the Agent Swin Transformer network integrates the efficient multi-scale attention (EMA) module and the spatial and channel reconstruction convolution (SCConv) module to strengthen features, preserve useful information, and reduce complexity and computational cost. Finally, gaze estimation is combined with head pose estimation to obtain the final gaze direction, which reduces the influence of interfering factors on eye appearance. Extensive experiments on the MPIIFaceGaze dataset show that the proposed method achieves a mean angular gaze error of 4.23°, which is more accurate than current mainstream methods of the same kind.
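The reported figure is a mean angular error, so it may help to see how such a metric is commonly computed. The sketch below is not taken from the paper: the function names are hypothetical, and the pitch/yaw-to-vector convention (camera looking along the negative z-axis) is an assumption that the abstract does not state. The angle between two vectors does not depend on that convention as long as predictions and labels share it.

```python
import math


def gaze_to_vector(pitch, yaw):
    """Convert (pitch, yaw) in radians to a 3D unit gaze vector.

    Assumed convention (not stated in the abstract): the camera looks
    along the negative z-axis.
    """
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )


def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_sim = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_sim))


def mean_angular_error_deg(preds, targets):
    """Mean angular error over paired (pitch, yaw) predictions and labels."""
    errors = [
        angular_error_deg(gaze_to_vector(*p), gaze_to_vector(*t))
        for p, t in zip(preds, targets)
    ]
    return sum(errors) / len(errors)
```

Under these assumptions, a result such as 4.23° would be the value returned by `mean_angular_error_deg` over a full test set of predicted and ground-truth gaze angles.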


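Cite this article

Xue Nan, Liu Lifen, Li Pengcheng. Gaze estimation algorithm with dual-branch feature fusion[J]. Control and Decision (控制与决策), 2025, 40(4): 1247-1256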


History

Received: 2024-05-16
Published online: 2025-03-21
Publication date: 2025-04-20
