Convolutional neural network training strategy using knowledge transfer
Authors:

Luo Ke, Zhou Anzhong, Luo Xiao

Affiliation:

(College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China)

About the authors:

Luo Ke (1961-), male, professor, Ph.D.; research interests include data mining and computer applications. Zhou Anzhong (1986-), male, master's student; research interests include data mining and artificial intelligence.

Corresponding author:

E-mail: sprite4@163.com.

CLC number:

TP181

Funding:

Supported by the National Natural Science Foundation of China (11671125, 71371065, 51707013).



Abstract:

To overcome the overfitting and vanishing-gradient problems that arise when deep convolutional neural networks are trained on limited labeled samples, a strategy is proposed for training a deep target model by transferring knowledge from a source model. The transferred knowledge comprises the class distribution of the samples and the low-level features of the source model. The class distribution provides inter-class correlation information about the samples, which extends the supervisory information of the training set and alleviates the shortage of labeled samples. The low-level features capture local characteristics of the samples, are general across related transfer tasks, and can help the target model escape poor local minima. These two kinds of knowledge are used to pre-train the target model so that it converges to a better position, after which the model is fine-tuned with the real labeled samples. Experimental results show that the proposed method both strengthens the model's resistance to overfitting and improves prediction accuracy.
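The class-distribution transfer described in the abstract is closely related to knowledge distillation: a temperature-softened softmax over the source model's outputs exposes inter-class correlations that a one-hot label hides, and the target model is pre-trained against those soft targets while the source's low-level weights are copied over. The sketch below illustrates the idea with plain numpy; the function names, the temperature value, and the dict-of-arrays parameter format are illustrative assumptions, not details from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: larger T yields a softer class
    # distribution that reveals similarity between classes.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_target_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy of the student's softened predictions against the
    # teacher's softened class distribution; the soft targets carry the
    # inter-class correlation information described in the abstract.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1))

def transfer_low_level(source_params, target_params, n_low_layers=2):
    # Copy the source model's early-layer (low-level) weights into the
    # target model; these local features are assumed to be general
    # across related tasks.
    for name in list(source_params)[:n_low_layers]:
        target_params[name] = source_params[name].copy()
    return target_params

# Example: source-model logits for one sample over 3 classes.
teacher = np.array([[5.0, 2.0, 0.5]])
print(softmax(teacher, T=1.0))  # sharp, nearly one-hot
print(softmax(teacher, T=4.0))  # softened, exposes class similarity
```

After pre-training on such soft targets (with the low-level weights copied from the source), the target model would be fine-tuned with the hard, real labels, as the abstract describes.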

Cite this article:

Luo Ke, Zhou Anzhong, Luo Xiao. Convolutional neural network training strategy using knowledge transfer[J]. Control and Decision, 2019, 34(3): 511-518.

History:
  • Online publication date: 2019-03-04