Manual detection of aircraft skin covers suffers from low operating efficiency and struggles to meet strict timeliness constraints, so existing research mostly focuses on collaborative operation of multiple unmanned aerial vehicles (UAVs). Multi-UAV cooperative mission planning (MCMP) for aircraft skin cover detection is the problem model that describes this collaborative detection, and current approaches mostly rely on heuristic algorithms. However, neither their solution speed nor their solution quality meets practical requirements. To address this, the MCMP problem is modeled as a capacitated vehicle routing problem (CVRP), and a two-stage deep reinforcement learning (TSDRL) solution model is proposed. In the first stage, an attention-based policy network determines the optimal number of UAVs according to the number of nodes. In the second stage, a new policy network with an encoder-decoder structure constructs the path of each UAV. Trained with policy gradient methods, the model efficiently computes high-quality paths for each UAV. To handle collisions in the 3D environment, the RRT* algorithm is used to optimize the paths so that they satisfy the collision constraints. Simulation results show that the proposed model outperforms existing deep reinforcement learning methods and heuristic algorithms in both computational efficiency and solution quality, generalizes well, and can be applied to different models.
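To make the CVRP view of the MCMP problem concrete, the following minimal sketch scores a candidate set of UAV routes. It is an illustration under stated assumptions, not the paper's formulation: the names `route_length` and `evaluate`, the use of straight-line 3D distance as cost, and the unit demands with a per-UAV capacity are all hypothetical choices made here for the example.

```python
import math

def route_length(route, coords, depot=0):
    """Straight-line length of depot -> route nodes -> depot (assumed cost)."""
    path = [depot, *route, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def evaluate(routes, coords, demand, capacity, depot=0):
    """Return (total_length, feasible) for one route per UAV.

    Feasible means every non-depot node is visited exactly once and no
    route's summed demand exceeds the (hypothetical) per-UAV capacity.
    """
    visited = sorted(n for r in routes for n in r)
    covers_all = visited == [n for n in range(len(coords)) if n != depot]
    within_capacity = all(sum(demand[n] for n in r) <= capacity for r in routes)
    total = sum(route_length(r, coords, depot) for r in routes)
    return total, covers_all and within_capacity

# Toy instance: node 0 is the depot, nodes 1..3 are detection points.
coords = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 4, 0)]
demand = [0, 1, 1, 1]

print(evaluate([[1, 2], [3]], coords, demand, capacity=2))  # two UAVs
print(evaluate([[1, 2, 3]], coords, demand, capacity=2))    # one UAV, over capacity
```

In this toy instance the single-UAV tour is shorter but violates the capacity limit, which is the kind of trade-off between the number of UAVs and the path length that the two stages of a model such as TSDRL would have to balance.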