A Decision Method for Mineral Processing Operation Indices Based on Multi-actions Parallel Asynchronous Deep Deterministic Policy Gradient
Author:
Affiliation:

Northeastern University

Biography:

Corresponding author:

CLC number:

TP1

Fund Project:

The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan)

    Abstract:

    To address the insufficient exploration ability of the deep deterministic policy gradient (DDPG) algorithm, this paper proposes a Multi-actions Parallel Asynchronous Deep Deterministic Policy Gradient (MPADDPG) algorithm for continuous control with deterministic policies in reinforcement learning. The algorithm uses multiple actor networks with different initializations and training, which greatly increases exploration at different levels; at the same time, by extending the critic architecture of the deterministic policy gradient structure, it reveals the relationship between exploration and exploitation. Using multiple DDPGs instead of a single DDPG mitigates the impact of any one DDPG performing poorly and improves learning stability, while the parallel asynchronous structure improves data utilization efficiency and speeds up network convergence; finally, each actor obtains a better policy gradient by influencing the critic's update. Results on a mineral processing operation task show that MPADDPG outperforms the DDPG algorithm.
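The multi-actor idea described above can be illustrated with a minimal toy sketch. This is not the paper's implementation: the critic here is a hand-written toy Q-function on a 1-D continuous action, the actors are linear policies, and names such as `n_actors`, `propose`, and `best_action` are illustrative assumptions. It only shows the core mechanism claimed in the abstract: several independently initialized actors propose actions, and the critic's value is used to pick among them, so one poorly performing actor cannot dominate.

```python
import numpy as np

def critic_q(state, action):
    # Toy stand-in for a learned critic: Q peaks when action == -state.
    return -(action + state) ** 2

class Actor:
    def __init__(self, rng):
        # Each actor gets a different random initialization, which is the
        # source of the extra exploration in the multi-actor scheme.
        self.w = rng.normal(scale=1.0)

    def propose(self, state):
        # Deterministic linear policy: action = w * state.
        return self.w * state

def best_action(actors, state):
    # Every actor proposes an action; the critic scores the proposals and
    # the highest-Q one is executed, mitigating a single bad policy.
    proposals = [a.propose(state) for a in actors]
    qs = [critic_q(state, p) for p in proposals]
    return proposals[int(np.argmax(qs))]

rng = np.random.default_rng(0)
actors = [Actor(rng) for _ in range(5)]
state = 1.0
action = best_action(actors, state)
```

By construction, the selected action's Q-value is at least as high as that of any single actor's proposal, which is the stability argument the abstract makes for replacing one DDPG with several.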

History
  • Received: 2020-07-30
  • Revised: 2022-04-21
  • Accepted: 2021-04-25
  • Online: 2021-05-15
  • Published: