Knowledge-coupled hierarchical value decomposition method for multi-aircraft cooperative penetration strategy planning
Affiliation:

College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China

Corresponding author:

E-mail: chenjing001@vip.sina.com

CLC number:

TP273

Fund projects:

National Natural Science Foundation of China (61806212); Hunan Provincial Postgraduate Research Innovation Project (CX20210011).

    Abstract:

    Empowering heterogeneous multi-entity systems with multi-agent reinforcement learning is a frontier topic in distributed artificial intelligence. In the task of multi-aircraft cooperative penetration against sea targets, efficient coordination among heterogeneous formations is the key to success, yet the partial observability of the environment lowers the exploration efficiency of multi-agent reinforcement learning. To address this, a knowledge-coupled hierarchical value decomposition (HAVED) method is proposed for planning multi-aircraft cooperative penetration strategies. The upper layer performs resource scheduling through positioning planning among formations (inter-team), and the lower layer performs target assignment through task planning within each formation (intra-team). On top of a base value decomposition algorithm, a weighting operator weights the loss associated with each joint action to avoid convergence to local optima, which improves the exploration and learning efficiency of penetration strategies for multiple aircraft and formations in adversarial scenarios. To verify the method, a typical scenario of multi-aircraft cooperative penetration against sea targets is designed. Under the centralized training and decentralized execution paradigm, simulation experiments are run on the Mozi wargaming platform and several value decomposition methods are compared. Finally, a replay analysis of the wargame confrontation data identifies three typical action strategies that emerge from the agents.
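
The abstract does not specify the weighting operator, so the following is only a minimal sketch of what weighting the loss of joint actions can look like. It assumes an optimistic scheme in the spirit of weighted QMIX: joint actions whose value is underestimated keep full weight, all others are down-weighted by a factor `alpha`. The function name `weighted_td_loss`, the parameter `alpha`, and the weighting rule itself are illustrative assumptions, not the paper's definitions.

```python
from typing import Sequence


def weighted_td_loss(q_tot: Sequence[float],
                     targets: Sequence[float],
                     alpha: float = 0.5) -> float:
    """Mean weighted squared TD error over a batch of joint actions.

    Assumed (hypothetical) weighting rule: weight 1.0 when the joint value
    underestimates its target (q < y), weight alpha otherwise.
    """
    if len(q_tot) != len(targets):
        raise ValueError("q_tot and targets must have the same length")
    if len(q_tot) == 0:
        raise ValueError("batch must not be empty")

    total = 0.0
    for q, y in zip(q_tot, targets):
        # Down-weight samples that are not underestimated.
        weight = 1.0 if q < y else alpha
        total += weight * (q - y) ** 2
    return total / len(q_tot)
```

With `alpha = 1.0` this reduces to the ordinary mean squared TD error used by unweighted value decomposition methods; smaller values of `alpha` shift the fit toward joint actions the current estimate undervalues.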

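Cite this article

Luo Junren, Zhang Wanpeng, Su Jiongming, et al. Knowledge-coupled hierarchical value decomposition method for multi-aircraft cooperative penetration strategy planning[J]. Control and Decision, 2025, 40(1): 137-147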

History
  • Published online: 2024-12-12
  • Publication date: 2025-01-20