Multi-task end-edge offloading based on Lyapunov optimization and deep reinforcement learning
Affiliations:

1. State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; 2. Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China; 3. Institutes for Robotics & Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; 4. University of Chinese Academy of Sciences, Beijing 100049, China

Corresponding author:

E-mail: xuchi@sia.cn

CLC number:

TP39

Funding:

National Natural Science Foundation of China (92267108, 62173322, 62133014, 61972389); Liaoning Provincial Science and Technology Program (2023JH3/10200004, 2023JH3/10200006, 2022JH25/10100005); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019202, 2020207, Y2021062).

Abstract:

To enable end-edge collaborative processing of heterogeneous industrial tasks in scenarios with multiple end devices and multiple edge servers, this paper proposes a multi-task end-edge offloading algorithm based on Lyapunov optimization and deep reinforcement learning. First, a long-term average system overhead minimization problem is formulated that jointly optimizes the task offloading decision, offloading ratio, and transmit power, subject to constraints on computing frequency, transmit power, long-term energy consumption, and task deadlines. Because the variables in the long-term objective and constraints are coupled across time slots, the problem is difficult to solve directly. Using Lyapunov optimization theory, the long-term problem is therefore decoupled into independent per-slot policy optimization problems. The resulting problem is then modelled as a Markov decision process, and a deep reinforcement learning-based multi-task offloading algorithm built on a double dueling deep neural network architecture is proposed. Experimental results show that the proposed algorithm converges stably and effectively reduces the long-term average system overhead while satisfying the long-term energy consumption constraint and the task deadline requirements.
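
The decoupling step described in the abstract can be illustrated with the standard drift-plus-penalty construction from Lyapunov optimization. The block below is only a minimal sketch under stated assumptions, not the paper's formulation: the names `Action`, `update_queue`, `choose_action`, and `simulate`, the trade-off weight `v`, and all numeric values are hypothetical, and a small fixed set of candidate actions stands in for the joint offloading decision that the paper learns with deep reinforcement learning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One candidate per-slot decision (all fields are illustrative)."""
    name: str
    cost: float    # per-slot system overhead
    energy: float  # device energy consumed in this slot
    delay: float   # task completion time


def update_queue(q: float, energy: float, budget: float) -> float:
    """Virtual energy queue: Q(t+1) = max(Q(t) + e(t) - E_avg, 0)."""
    return max(q + energy - budget, 0.0)


def choose_action(actions, q: float, v: float, deadline: float) -> Action:
    """Pick the deadline-feasible action minimizing V*cost + Q*energy.

    This is the per-slot problem obtained after decoupling: it depends
    only on the current queue backlog, not on future slots.
    """
    feasible = [a for a in actions if a.delay <= deadline]
    if not feasible:
        raise ValueError("no action meets the deadline")
    return min(feasible, key=lambda a: v * a.cost + q * a.energy)


def simulate(actions, slots: int, v: float, budget: float, deadline: float):
    """Run the per-slot rule and return (avg cost, avg energy, final queue)."""
    q = 0.0
    total_cost = 0.0
    total_energy = 0.0
    for _ in range(slots):
        a = choose_action(actions, q, v, deadline)
        total_cost += a.cost
        total_energy += a.energy
        q = update_queue(q, a.energy, budget)
    return total_cost / slots, total_energy / slots, q


# Hypothetical candidates: cheap-but-hungry local run, costlier offload,
# and a low-cost option that misses the deadline.
CANDIDATES = [
    Action("local", cost=1.0, energy=2.0, delay=0.5),
    Action("offload", cost=2.0, energy=0.5, delay=0.8),
    Action("slow", cost=0.1, energy=0.1, delay=5.0),
]
```

In this sketch the backlog `q` acts as a price on energy: when past slots have overspent the budget, `q` grows and the rule shifts toward the lower-energy action. Because the queue update implies that the time-averaged energy exceeds the budget by at most the final backlog divided by the number of slots, a bounded queue keeps the long-term energy constraint satisfied. A larger `v` would favor lower cost at the price of a larger backlog.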

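Cite this article

Xu Chi, Tang Zixuan, Jin Xi, et al. Multi-task end-edge offloading based on Lyapunov optimization and deep reinforcement learning[J]. Control and Decision, 2024, 39(7): 2457-2464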

History
  • Published online: 2024-06-06
  • Publication date: 2024-07-20