Self-knowledge distillation based on dynamic mixed attention

Authors: Tang Yuan, Chen Ying

Affiliation: Key Laboratory of Advanced Process Control for Light Industry of Ministry of Education, Jiangnan University, Wuxi 214122, China

Corresponding author e-mail: chenying@jiangnan.edu.cn

CLC number: TP391

Fund: National Natural Science Foundation of China (62173160)


    Abstract:

    Self-knowledge distillation reduces the dependence on a pre-trained teacher network, but its attention mechanism focuses only on the main subject of the image: it ignores background knowledge that carries color and texture information, and mis-focused spatial attention may cause subject information to be omitted. To address these problems, a self-knowledge distillation method based on dynamic mixed attention is proposed, which reasonably exploits both the foreground and background knowledge in images and thereby improves classification accuracy. First, a mask segmentation module is designed, which builds an attention mask with the self-teacher network and segments the feature map into background and foreground features, from which the ignored background knowledge and the omitted foreground information are extracted. Then, a knowledge extraction module based on a dynamic attention allocation strategy is proposed: a parameter derived from the predicted probability distribution dynamically adjusts the loss weights of background attention and foreground attention, guiding the foreground and background knowledge to cooperate and progressively refining the classifier network's attention on the image, which improves the classifier's performance. Experiments show that the proposed method improves the accuracy of ResNet18 and WRN-16-2 on CIFAR100 by 2.15% and 1.54% respectively; for fine-grained visual recognition tasks, the accuracy of ResNet18 on the CUB200 and MIT67 datasets is improved by 3.51% and 1.05% respectively, outperforming existing state-of-the-art methods.
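    The two mechanisms described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the 0.5 mask threshold, the confidence-based weighting formula, and the names `split_foreground_background` and `dynamic_mixed_loss` are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def split_foreground_background(attn, thresh=0.5):
    """Split an attention map into foreground and background parts with a
    binary mask, in the spirit of the mask segmentation module.
    The 0.5 threshold is an assumed value, not taken from the paper."""
    attn = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)
    mask = attn >= thresh
    return attn * mask, attn * (~mask)  # foreground map, background map

def dynamic_mixed_loss(logits, target, fg_loss, bg_loss):
    """Mix the foreground/background attention losses with a weight taken
    from the predicted probability of the ground-truth class (a hypothetical
    stand-in for the paper's probability-based parameter)."""
    alpha = softmax(logits)[target]
    # Confident prediction: the foreground focus is already good, so lean on
    # background (color/texture) knowledge; uncertain prediction: keep
    # correcting the foreground focus.
    return alpha * bg_loss + (1.0 - alpha) * fg_loss
```

    In this sketch a well-classified sample shifts weight toward the background loss, while a poorly classified one keeps the emphasis on fixing the foreground attention, matching the cooperative behavior the abstract describes.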

Cite this article:

Tang Y, Chen Y. Self-knowledge distillation based on dynamic mixed attention[J]. Control and Decision, 2024, 39(12): 4099-4108.

History:
  • Online: 2024-11-20
  • Published: 2024-12-20