Research advances in deep neural networks learning rate strategies

Affiliation:

College of Electrical Engineering, Sichuan University, Chengdu 610065, China

Corresponding author:

E-mail: junranzhang@scu.edu.cn

CLC number:

TP273

Fund projects:

Emergency Key Project of the Intelligent Electric Power Grid Key Laboratory of Sichuan Province (020IEPG-KL-20YJ01); Deyang Science and Technology (Open Competition) Project (2021JBJZ007); 1·3·5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (ZYJC21041); Sichuan Science and Technology Program (2022YFS0178).

    Abstract:

    The learning rate (LR) is an important hyperparameter for the effective training of deep neural networks (DNNs). However, tuning the learning rate during DNN training still poses many difficulties and challenges. Even when a constant learning rate is the goal, choosing an optimal constant initial learning rate is not easy. A dynamic learning rate involves the different stages of training and requires multi-step adjustment to achieve high accuracy and fast convergence: during adjustment, too small a learning rate may cause the model to converge slowly or become trapped in a local optimum, while too large a learning rate may hinder convergence and cause oscillation and divergence. To address this, we review recent progress in learning rate research for deep learning algorithms, and we test, analyze, and compare the performance of four families of learning rate schedules (piecewise decay, smooth decay, cyclic learning rates, and learning rates with warm start) on several common datasets, in terms of convergence speed, robustness, and mean and variance, among other measures. Finally, we summarize the paper and give an outlook on the remaining problems and future research trends in this field.
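To make the four schedule families named in the abstract concrete, the following is a minimal Python sketch of one representative of each. It is an illustration under stated assumptions, not the formulations or settings tested in the paper: the function names, the default values (`base_lr`, `drop`, `half_cycle`, `warmup_epochs`, and so on) are all hypothetical, cosine annealing is used as one possible smooth decay, a triangular wave as one possible cyclic schedule, and "warm start" is interpreted here as a linear warm-up followed by cosine decay, which may differ from the paper's usage.

```python
import math


def step_decay(epoch, base_lr=0.1, drop=0.1, epochs_per_drop=30):
    """Piecewise decay: multiply the rate by `drop` every `epochs_per_drop` epochs."""
    return base_lr * drop ** (epoch // epochs_per_drop)


def cosine_decay(epoch, total_epochs, base_lr=0.1, min_lr=0.0):
    """Smooth decay: cosine annealing from `base_lr` down to `min_lr`."""
    progress = min(epoch, total_epochs) / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def triangular_cyclic(step, min_lr=0.001, max_lr=0.1, half_cycle=10):
    """Cyclic: linear ramp up then down between `min_lr` and `max_lr`.

    One full cycle lasts 2 * half_cycle steps.
    """
    pos = step % (2 * half_cycle)
    # Fraction of the way from min_lr to max_lr: rises 0 -> 1, then falls 1 -> 0.
    frac = pos / half_cycle if pos <= half_cycle else 2.0 - pos / half_cycle
    return min_lr + (max_lr - min_lr) * frac


def warm_start_cosine(epoch, total_epochs, base_lr=0.1, warmup_epochs=5):
    """Warm start (assumed here to mean linear warm-up), then cosine decay."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return cosine_decay(epoch - warmup_epochs, total_epochs - warmup_epochs, base_lr)


if __name__ == "__main__":
    total = 100  # hypothetical training length in epochs
    for epoch in (0, 4, 10, 30, 60, 99):
        print(
            epoch,
            step_decay(epoch),
            cosine_decay(epoch, total),
            triangular_cyclic(epoch),
            warm_start_cosine(epoch, total),
        )
```

Each function maps an epoch (or step) index to a learning rate, so any of them could be called once per epoch to set the optimizer's rate. Writing the schedules as pure functions keeps them independent of a particular deep learning framework.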

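Cite this article

Liu Yunfei (刘云飞), Zhang Junran (张俊然). Research advances in deep neural networks learning rate strategies[J]. Control and Decision (控制与决策), 2023, 38(9): 2444-2460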

History
  • Online publication date: 2023-09-04
  • Publication date: 2023-09-20