A Q-learning-based multi-task multi-objective particle swarm optimization algorithm
Affiliation:

1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; 2. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China

Corresponding author:

E-mail: rechardhan@bjut.edu.cn

CLC number:

TP18

Fund Project:

National Natural Science Foundation of China (62125301, 61890930-5, 61903010, 62021003); National Key Research and Development Program of China (2022YFB3305800-5, 2018YFC1900800-5); Key Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KZ202110005009); Beijing Outstanding Young Scientist Program (BJJWZYJH01201910005020); Youth Beijing Scholars Program (037).

    Abstract:

    The multi-task particle swarm optimization (MTPSO) algorithm converges quickly through knowledge transfer learning and is widely used to solve multi-task multi-objective optimization problems. However, MTPSO cannot readily adapt its optimization process to the evolutionary state of the population; its search is strongly random and lacks guidance, so it is prone to falling into local optima and converges poorly. To address this problem, this paper proposes a Q-learning-based multi-task multi-objective particle swarm optimization algorithm (QM2PSO), which uses the learning and prediction abilities of reinforcement learning to guide the optimization. First, a dynamic parameter update method is designed, in which Q-learning updates the inertia weight and acceleration parameters of the particle swarm online to improve the ability of the current particles to converge to the Pareto front. Second, a mutation search strategy based on the Cauchy distribution is developed, which alternates between global and local search for the multi-task optimal solutions, balancing exploration and exploitation so that the algorithm avoids falling into local optima. Finally, a knowledge transfer method based on a positive transfer criterion is designed, in which Q-learning updates the knowledge transfer rate to mitigate negative knowledge transfer. Comparative experiments against existing classical algorithms show that the proposed QM2PSO achieves better convergence performance.
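
The first two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' QM2PSO: the abstract does not specify the states, actions, or reward, so everything below is an assumption made for illustration. The sketch uses tabular Q-learning with an epsilon-greedy policy to pick one of three hypothetical (inertia weight, cognitive coefficient, social coefficient) triples per iteration, a two-valued state (whether the previous iteration improved the global best), a reward of +1 or -1, and an occasional Cauchy-distributed jump on one coordinate. It runs on a single-objective toy function (`sphere`) rather than on multi-task multi-objective problems, and it omits knowledge transfer entirely.

```python
import math
import random

# Hypothetical action set: (inertia weight, cognitive coeff., social coeff.).
ACTIONS = [(0.9, 2.0, 1.0), (0.7, 1.5, 1.5), (0.4, 1.0, 2.0)]


def sphere(x):
    """Toy single-objective stand-in for the real multi-objective tasks."""
    return sum(v * v for v in x)


def cauchy_sample(scale):
    """Draw from a zero-centred Cauchy distribution via its inverse CDF."""
    return scale * math.tan(math.pi * (random.random() - 0.5))


def q_pso(dim=5, n_particles=20, iters=200, alpha=0.1, gamma=0.9,
          epsilon=0.1, mutation_prob=0.1, seed=0):
    random.seed(seed)
    pos = [[random.uniform(-5, 5) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [sphere(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    # Assumed state: 1 if the previous iteration improved gbest, else 0.
    q_table = [[0.0] * len(ACTIONS) for _ in range(2)]
    state = 0
    history = [gbest_val]

    for _ in range(iters):
        # Epsilon-greedy choice of a parameter triple for this iteration.
        if random.random() < epsilon:
            action = random.randrange(len(ACTIONS))
        else:
            action = max(range(len(ACTIONS)),
                         key=lambda a: q_table[state][a])
        w, c1, c2 = ACTIONS[action]

        prev_best = gbest_val
        for i in range(n_particles):
            # Standard PSO velocity and position update.
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Cauchy mutation: an occasional heavy-tailed jump on one axis.
            if random.random() < mutation_prob:
                d = random.randrange(dim)
                pos[i][d] += cauchy_sample(scale=0.5)
            val = sphere(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

        # Assumed reward (+1 on improvement, -1 otherwise), then the
        # standard tabular Q-learning update.
        improved = gbest_val < prev_best
        reward = 1.0 if improved else -1.0
        next_state = 1 if improved else 0
        q_table[state][action] += alpha * (
            reward + gamma * max(q_table[next_state])
            - q_table[state][action])
        state = next_state
        history.append(gbest_val)

    return gbest, gbest_val, history, q_table
```

Calling `q_pso()` returns the best position found, its objective value, the per-iteration history of the best value, and the learned Q-table. The heavy tails of the Cauchy distribution make most mutations small and a few very large, which is one common way to mix local refinement with global jumps.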

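Cite this article

Han Honggui, Xu Ziang, Wang Jingjing. A Q-learning-based multi-task multi-objective particle swarm optimization algorithm[J]. Control and Decision, 2023, 38(11): 3039-3047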

History
  • Published online: 2023-10-08
  • Publication date: 2023-11-20