Intelligent critic design for asymmetric constrained zero-sum games without relying on initial admissible control

doi:10.13195/j.kzyjc.2024.0356

2025, Vol. 40, No. 4, pp. 1347-1356

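CLC number: TP273
Funding: National Natural Science Foundation of China (62222301, 61890930-5, 62021003); National Science and Technology Major Project for New Generation Artificial Intelligence (2021ZD0112302, 2021ZD0112301); Beijing Natural Science Foundation (JQ19013).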



Abstract:

This paper uses the adaptive critic control approach to study continuous-time zero-sum games with asymmetric constraints. First, a novel nonquadratic function is constructed to handle the asymmetric constraints, which relaxes the restriction on the control matrix. Second, the optimal control, the worst-case disturbance, and the Hamilton-Jacobi-Isaacs equation are derived. An adaptive critic control method is then built to approximate the optimal cost function, yielding the near-optimal control and the near-worst-case disturbance. For zero-sum games with asymmetric constraints, a new critic learning criterion is proposed that strengthens the learning process and removes the dependence on an initial admissible control, which, according to the authors, previous papers have not considered. Moreover, the Lyapunov method is used to prove the stability of the system state and of the critic network's weight estimation error. Finally, two examples, an F-16 aircraft and an inverted pendulum, verify the effectiveness of the proposed algorithm. For comparison, simulation results under the traditional learning algorithm are also given, further illustrating the feasibility of the proposed learning criterion.
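
For orientation, the continuous-time zero-sum setting named in the abstract can be sketched in its generic textbook form. This is not a reproduction of the paper's equations: the symbols f, g, h, Q, \gamma, the bounds u_{\min} and u_{\max}, and the control penalty \mathcal{Y}(u) are assumptions introduced here for illustration, and the paper's specific nonquadratic function for asymmetric bounds is not shown on this page.

```latex
% Assumed input-affine dynamics with control u and disturbance v
\dot{x}(t) = f(x) + g(x)\,u(t) + h(x)\,v(t),
\qquad u_{\min} \le u_i \le u_{\max}, \quad u_{\min} \ne -u_{\max}

% Assumed cost: state penalty, a nonquadratic control penalty \mathcal{Y}(u),
% and a disturbance attenuation term with level \gamma > 0
V(x_0) = \int_0^{\infty} \big( x^{\top} Q x + \mathcal{Y}(u) - \gamma^{2} v^{\top} v \big)\, dt

% Hamilton-Jacobi-Isaacs equation for the optimal cost V^{*}
0 = \min_{u} \max_{v} \Big[ x^{\top} Q x + \mathcal{Y}(u) - \gamma^{2} v^{\top} v
    + (\nabla V^{*})^{\top} \big( f(x) + g(x)\,u + h(x)\,v \big) \Big]

% Worst-case disturbance from the stationarity condition in v
v^{*}(x) = \frac{1}{2\gamma^{2}}\, h^{\top}(x)\, \nabla V^{*}(x)
```

The optimal control u^{*} is omitted because its closed form depends on the particular choice of \mathcal{Y}(u), which the abstract does not specify.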


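Cite this article

Li Menghua, Wang Ding, Zhao Mingming, et al. Intelligent critic design for asymmetric constrained zero-sum games without relying on initial admissible control[J]. Control and Decision, 2025, 40(4): 1347-1356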


History

Received: 2024-04-01
Published online: 2025-03-21
Publication date: 2025-04-20
