Reinforcement Learning-Based Optimal Operational Control of Complex Industrial Processes Using Singular Perturbations

Affiliation:

China University of Mining and Technology

CLC number:

TP29

Funding:

National Natural Science Foundation of China (62073327, 62273350, 62403467, 62403466); Natural Science Foundation of Jiangsu Province (BK20241635); Postdoctoral Fellowship Program of CPSF (GZB20240827); Jiangsu Funding Program for Excellent Postdoctoral Talent (2024ZB604, 2024ZB835); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX24_2774); Graduate Innovation Program of China University of Mining and Technology (2024WLJCRCZL139).

    Abstract:

    Complex industrial processes often exhibit coupled fast and slow dynamics. The conventional cascade design approach cannot guarantee optimal operational performance of the overall system, while existing integrated design methods for optimal operational control are prone to the curse of dimensionality and ill-conditioned numerics. For a class of complex industrial processes in which the model parameters of both the unit devices and the operational process are unknown, we propose a composite non-cascade optimal operational control approach based on singular perturbations and parallel fast and slow reinforcement learning. First, by introducing a convergence factor, the optimal operational control problem is modeled as a non-cascade optimal control problem for a two-time-scale system. Second, singular perturbation theory is used to decompose the original optimal control problem into two reduced-order problems: an optimal regulation problem for the fast subsystem and an optimal set-point tracking problem for the slow subsystem. Third, within the reinforcement learning framework, a data-driven iterative algorithm is designed to learn the optimal controllers of the fast and slow subsystems, from which a composite optimal control strategy that does not depend on the system model is constructed. Compared with existing methods, the proposed approach not only handles unknown model parameters of the unit devices but also achieves zero-error asymptotic tracking of the desired operational index. Finally, comparative experiments on a hematite mixed separation thickening process verify the effectiveness and superiority of the proposed method.
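
    The abstract does not state the system model, so the following is only a minimal sketch of the fast/slow decomposition idea. It assumes a linear time-invariant singularly perturbed plant, dx/dt = A11 x + A12 z + B1 u and eps * dz/dt = A21 x + A22 z + B2 u, which is the textbook setting and not necessarily the paper's. All names (`decompose`, `A11`, `eps`, the toy numbers) are hypothetical, and the sketch is model-based, whereas the paper's algorithm is data-driven and model-free. It builds the reduced slow and fast subsystems and checks that their eigenvalues approximate those of the full system when eps is small.

```python
import numpy as np

def decompose(A11, A12, A21, A22, B1, B2):
    """Split a linear singularly perturbed system into slow and fast parts.

    Returns ((A0, B0), (A22, B2)): the reduced slow subsystem and the
    fast (boundary-layer) subsystem in the stretched time scale t / eps.
    """
    A22_inv = np.linalg.inv(A22)        # standard assumption: A22 is nonsingular
    A0 = A11 - A12 @ A22_inv @ A21      # slow dynamics after eliminating z
    B0 = B1 - A12 @ A22_inv @ B2        # slow input matrix
    return (A0, B0), (A22, B2)

# Hypothetical toy plant with one slow and one fast state.
eps = 0.01
A11 = np.array([[-1.0]])
A12 = np.array([[1.0]])
A21 = np.array([[1.0]])
A22 = np.array([[-2.0]])
B1 = np.array([[1.0]])
B2 = np.array([[1.0]])

(A0, B0), (Af, Bf) = decompose(A11, A12, A21, A22, B1, B2)

# Full two-time-scale system matrix in the original time scale.
full = np.block([[A11, A12], [A21 / eps, A22 / eps]])
full_eigs = np.sort(np.linalg.eigvals(full).real)

# Approximation: slow eigenvalues from A0, fast eigenvalues from A22 / eps.
approx_eigs = np.sort(np.concatenate([
    np.linalg.eigvals(Af).real / eps,
    np.linalg.eigvals(A0).real,
]))

print("full system eigenvalues:  ", full_eigs)
print("slow/fast approximation:  ", approx_eigs)
```

    In this setting a composite controller is typically formed as the sum of a slow control and a fast control, each designed on its own low-order subsystem. This avoids working with the stiff full-order matrix `full`, whose entries scale with 1/eps.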

History
  • Received: 2024-06-12
  • Last revised: 2024-10-24
  • Accepted: 2024-10-26
  • Published online: 2024-11-07