Abstract: The allocation of jamming resources is an important aspect of cognitive electronic warfare, aiming to achieve maximum jamming effectiveness through the rational allocation of limited jamming resources. This paper addresses the challenges faced by multi-agent reinforcement learning (MARL) algorithms in scenarios where UAV swarms collaboratively jam multiple mobile communication targets under constrained communication and navigation conditions, where the expansive state space and non-stationary environment lead to suboptimal decision-making performance. We propose an Attention-Pretrained Self-Encoder (APSE) that serves as a preprocessing unit for MARL algorithms, enabling effective feature extraction and dimensionality reduction of environmental states. In addition, we adopt the centralized training with decentralized execution paradigm to mitigate the impact of environmental non-stationarity on decision performance. Experimental results in the UAV swarm collaborative jamming simulation environment established in this study demonstrate that integrating APSE into MARL algorithms significantly improves average reward and jamming resource allocation efficiency. Among the evaluated algorithms, MAPPO-APSE performs best across all metrics, reducing jamming resource consumption by 20% while maintaining a longer effective jamming duty cycle than MAPPO.
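The abstract does not specify the APSE architecture; the following is a minimal PyTorch sketch of how an attention-based, reconstruction-pretrained encoder could compress per-entity environmental states before they are handed to a MARL policy. All class and parameter names (e.g., AttentionAutoencoder, embed_dim, latent_dim) are illustrative assumptions, not the paper's actual design.

```python
# Illustrative sketch only: an attention-based autoencoder used to compress
# per-entity observations into a low-dimensional latent state before it is
# fed to a MARL policy. Layer names and sizes are assumptions, not the
# paper's actual APSE architecture.
import torch
import torch.nn as nn


class AttentionAutoencoder(nn.Module):
    def __init__(self, obs_dim: int, embed_dim: int = 64, latent_dim: int = 16, n_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(obs_dim, embed_dim)          # per-entity embedding
        self.attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
        self.to_latent = nn.Linear(embed_dim, latent_dim)   # compressed state features
        self.decoder = nn.Sequential(                        # reconstruction head, used only for pretraining
            nn.Linear(latent_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, obs_dim)
        )

    def forward(self, obs):
        # obs: (batch, n_entities, obs_dim), e.g., one row per UAV/target
        h = self.embed(obs)
        h, _ = self.attn(h, h, h)        # self-attention across entities
        z = self.to_latent(h)            # latent features passed to the MARL policy
        recon = self.decoder(z)          # reconstruction of the raw observation
        return z, recon


def pretrain(model: AttentionAutoencoder, states: torch.Tensor, epochs: int = 10, lr: float = 1e-3):
    """Pretrain on logged environment states by minimizing reconstruction error;
    the frozen encoder then provides reduced observations to the MARL algorithm."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        z, recon = model(states)
        loss = loss_fn(recon, states)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Under this reading, the encoder output z replaces the raw high-dimensional state as each agent's observation, which is one plausible way a pretrained attention module could reduce the state space faced by MAPPO or similar MARL algorithms.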