Knowledge-coupled hierarchical value decomposition method for multi-aircraft cooperative penetration strategy planning
Affiliation:

National University of Defense Technology

CLC number:

TP273

Fund Project:

The National Natural Science Foundation of China (General Program, Key Program, Major Program)

    Abstract:

    Empowering heterogeneous multi-entity systems with multi-agent reinforcement learning (MARL) is a frontier topic in distributed artificial intelligence. In the task of multi-aircraft cooperative penetration against maritime targets, efficient coordination among heterogeneous formations is the key to success. Because the environment is only partially observable, MARL methods explore inefficiently. To address this, this paper proposes a knowledge-coupled HierArchical ValuE Decomposition (HAVED) method for planning multi-aircraft cooperative penetration strategies. The upper layer performs resource scheduling around formation positioning planning, and the lower layer performs target allocation around task planning within each formation. For the underlying value decomposition algorithm, a weighting operator weights the loss associated with each joint action to avoid convergence to local optima, with the aim of improving the exploration and learning efficiency of penetration strategies for multiple aircraft and formations in adversarial scenarios. To verify the algorithm, a typical task scenario is designed around multi-aircraft cooperative penetration against maritime targets. Under the centralized training with decentralized execution paradigm, simulation experiments on the Mozi wargaming platform verify the effectiveness of the method. A replay analysis of the engagement process yields three typical action strategies. Project address: https://gitee.com/jrluo2049/haved.
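
    The abstract does not spell out the weighting operator, so the following is only a minimal sketch of what a weighted loss over joint actions might look like. It assumes an optimistic weighting rule of the kind used in Weighted QMIX: joint actions whose value is underestimated keep full weight, and all others are down-weighted by a factor alpha. The function name `weighted_td_loss`, the parameter `alpha`, and the rule itself are illustrative assumptions, not the HAVED implementation.

```python
from typing import Sequence


def weighted_td_loss(q_tot: Sequence[float],
                     targets: Sequence[float],
                     alpha: float = 0.5) -> float:
    """Mean weighted squared TD error over a batch of joint actions.

    q_tot   : estimated joint action values, one per sample
    targets : bootstrapped TD targets, one per sample
    alpha   : weight in (0, 1] for samples that are NOT underestimated
              (hypothetical parameter, assumed for illustration)
    """
    if len(q_tot) != len(targets):
        raise ValueError("q_tot and targets must have the same length")
    if len(q_tot) == 0:
        raise ValueError("batch must not be empty")

    total = 0.0
    for q, y in zip(q_tot, targets):
        td_error = q - y
        # Assumed rule: an underestimated joint action (q < y) keeps
        # full weight; every other sample is scaled down by alpha.
        weight = 1.0 if td_error < 0 else alpha
        total += weight * td_error ** 2
    return total / len(q_tot)


# One underestimated sample (1.0 vs 2.0) and one overestimated (3.0 vs 2.0).
loss = weighted_td_loss([1.0, 3.0], [2.0, 2.0], alpha=0.5)
```

    With alpha = 1 the function reduces to the ordinary mean squared TD error, so the weighting can be read as a departure from the unweighted value decomposition loss that emphasizes joint actions the mixer currently undervalues.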

History
  • Received: 2024-05-01
  • Last revised: 2024-10-27
  • Accepted: 2024-09-14
  • Published online: 2024-09-18