Abstract: To improve logistics efficiency, we examine a vehicle–drone cooperative routing problem with workload balance. A mixed-integer linear programming model is formulated to minimize both the cost and the workload imbalance of the vehicles and drones. A model- and graph-based reinforcement learning-driven multi-objective method is proposed to solve the routing problem. First, the method incorporates a hybrid strategy-based population initialization approach and customized local search operators to explore the solution space efficiently. Second, a Pareto local search algorithm incorporating reinforcement learning is proposed to strengthen the local search ability of the multi-objective method. Its feature extraction mechanism captures spatial patterns in the routes, and its policy method employs multi-step virtual trajectories to enrich state information and improve sample efficiency. Finally, parameter calibration and comparative experiments confirm the validity of the proposed components and demonstrate that the algorithm outperforms CPLEX and several state-of-the-art competitors on the proposed bi-objective model.
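To make the bi-objective setting concrete, the following minimal sketch shows how candidate routing solutions could be compared by Pareto dominance when both cost and workload imbalance are minimized. It is an illustration under stated assumptions, not the proposed algorithm: the names `dominates` and `pareto_front` and the sample objective values are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: each solution is summarized by
# (cost, workload_imbalance); both objectives are minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in both objectives and better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the nondominated points, dropping exact duplicates."""
    front: List[Objectives] = []
    for p in points:
        if any(dominates(q, p) for q in points):
            continue  # p is dominated by some other candidate
        if p not in front:
            front.append(p)
    return front


# Made-up objective values, for illustration only.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 9.0), (8.0, 7.0)]
front = pareto_front(candidates)
print(front)
```

A Pareto local search would typically maintain such a nondominated archive and update it as neighboring solutions are evaluated; how the reinforcement learning policy guides that process is specific to the proposed method and is not modeled here.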