Research Advances in Deep Neural Network Learning Rate Strategies
Author:
Affiliation:

Sichuan University

Author biography:

Corresponding author:

CLC number:

TP273

Fund projects:

Emergency Key Project of the Intelligent Power Grid Key Laboratory of Sichuan Province (No. 020IEPG-KL-20YJ01); Deyang Science and Technology (Unveiling the List) Program (No. 2021JBJZ007); 1·3·5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (No. ZYJC21041); Sichuan Science and Technology Program (No. 2022YFS0178).

    Abstract:

    Learning rate (LR) is an important hyperparameter for the effective training of deep neural networks (DNNs). However, tuning the learning rate during DNN training remains difficult: even when the goal is simply to select a constant value, choosing an optimal constant initial learning rate is nontrivial. Dynamic learning rates, in turn, require multi-step adjustment at different stages of training to achieve high accuracy and fast convergence. During adjustment, a learning rate that is too small may cause the model to converge slowly or become trapped in a local optimum, while one that is too large may hinder convergence and cause oscillation and divergence. This paper surveys recent progress in learning rate research for deep learning algorithms, and tests and compares four families of learning rate schedules, including piecewise decay, smooth decay, cyclic learning rates, and learning rates with warm restarts, on several common datasets, evaluating convergence speed, robustness, and mean and variance of performance. Finally, we summarize the paper and discuss the remaining problems and future research trends in this field.
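The four schedule families named in the abstract can be illustrated with minimal epoch-to-LR functions. This is a sketch only; the base learning rate, decay factors, milestones, and cycle lengths below are illustrative assumptions, not values from the paper.

```python
import math

BASE_LR = 0.1  # illustrative initial learning rate

def piecewise_decay(epoch, milestones=(30, 60), gamma=0.1):
    """Piecewise (step) decay: multiply the LR by gamma at each milestone epoch."""
    return BASE_LR * gamma ** sum(epoch >= m for m in milestones)

def smooth_decay(epoch, total_epochs=90):
    """Smooth decay (cosine form): LR falls continuously from BASE_LR to 0."""
    return 0.5 * BASE_LR * (1 + math.cos(math.pi * epoch / total_epochs))

def cyclic(epoch, period=10, min_lr=0.001):
    """Cyclic (triangular) LR: LR oscillates between min_lr and BASE_LR."""
    phase = (epoch % period) / period   # position within the current cycle, in [0, 1)
    tri = 1 - abs(2 * phase - 1)        # rises to 1 at mid-cycle, then falls back
    return min_lr + (BASE_LR - min_lr) * tri

def warm_restart(epoch, period=10):
    """Cosine decay with warm restarts: the cosine schedule resets every period."""
    t = epoch % period
    return 0.5 * BASE_LR * (1 + math.cos(math.pi * t / period))
```

These toy functions capture the qualitative behavior compared in the survey: step decay changes the LR abruptly, smooth decay changes it continuously, and the cyclic and warm-restart schedules periodically return the LR to a high value to help escape poor local optima.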

History
  • Received: 2022-01-21
  • Revised: 2023-01-15
  • Accepted: 2022-05-31
  • Published online: 2022-06-13
  • Publication date: