UAV air combat intelligent decision-making based on domain randomization enhanced EfficientZero

doi:10.13195/j.kzyjc.2024.1443

2025, Vol. 40, No. 11, pp. 3273-3286

CLC number: V249.12


Abstract:

Intelligent air combat for unmanned aerial vehicles (UAVs) is a disruptive technology that could change the form of future warfare, and deep reinforcement learning is an important paradigm for intelligent air combat decision-making. Despite substantial recent progress, reinforcement learning decision models trained through interaction with virtual simulations still suffer from low learning efficiency and poor generalization, which makes sim-to-real transfer across the reality gap difficult. To improve the applicability of air combat decision models when moving from simulation to the physical world, this paper proposes a design method for a close-range air combat maneuvering decision model based on the EfficientZero algorithm enhanced with domain randomization. The method learns decision-making ability by making efficient use of the environment interaction data generated through self-play, and further applies domain randomization to improve the model's robustness. Simulation results show that the decision model obtained with EfficientZero uses air combat engagement samples efficiently and avoids the strategy cycling problem common in self-play. In addition, domain randomization significantly improves the generalization of the reinforcement learning decision model and strengthens its robustness under reality-gap conditions.
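To make the idea of domain randomization concrete, the following is a minimal sketch of drawing a fresh simulator configuration for each training episode. The abstract does not say which parameters are randomized or over what ranges, so the parameter names (`thrust_scale`, `drag_scale`, `sensor_noise_std`, `action_delay_steps`), the ranges, and the helper functions are all hypothetical and chosen only for illustration. This is not the authors' implementation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical simulator parameters (assumed for this sketch)."""
    thrust_scale: float       # multiplier on nominal engine thrust
    drag_scale: float         # multiplier on nominal drag coefficient
    sensor_noise_std: float   # std of Gaussian noise added to observations
    action_delay_steps: int   # control latency in simulation steps


# Illustrative ranges only; real ranges would come from the reality gap
# one expects between the simulator and the physical aircraft.
RANGES = {
    "thrust_scale": (0.9, 1.1),
    "drag_scale": (0.8, 1.2),
    "sensor_noise_std": (0.0, 0.05),
}
MAX_DELAY_STEPS = 3


def sample_params(rng: random.Random) -> SimParams:
    """Draw one randomized simulator configuration for a training episode."""
    return SimParams(
        thrust_scale=rng.uniform(*RANGES["thrust_scale"]),
        drag_scale=rng.uniform(*RANGES["drag_scale"]),
        sensor_noise_std=rng.uniform(*RANGES["sensor_noise_std"]),
        action_delay_steps=rng.randint(0, MAX_DELAY_STEPS),
    )


def noisy_observation(obs, params: SimParams, rng: random.Random):
    """Perturb an observation vector with this episode's sensor noise."""
    return [x + rng.gauss(0.0, params.sensor_noise_std) for x in obs]


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        # Each self-play episode would run under a different configuration.
        params = sample_params(rng)
        print(episode, params)
```

Under this assumption, a policy trained across many sampled configurations cannot rely on any single set of simulator dynamics, which is the intuition behind the robustness gains the abstract reports.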

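Cite this article

Ni Hao (倪浩), Zhang Sheng (章胜), Liu Fuwei (刘福炜), et al. UAV air combat intelligent decision-making based on domain randomization enhanced EfficientZero[J]. Control and Decision (控制与决策), 2025, 40(11): 3273-3286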


History

Received: 2024-12-13
Published online: 2025-10-14
Publication date: 2025-11-20
