School of Aeronautics and Astronautics, Zhejiang University
The Key Project of Joint Fund of Ministry of Education for Equipment Pre-research (6141A02011803)
Because Flapping Wing Micro Aerial Vehicles (FWMAVs) have poor maneuverability, a deep reinforcement learning (DRL) based local path planning method (IL-PPO2), assisted by demonstration learning, is proposed for unknown environments. First, because the stereo camera on the FWMAV has a limited field of view, a “Heart” algorithm is proposed to reduce the required control accuracy while improving robustness. Second, based on the characteristics of the Heart algorithm, a U-trap avoidance framework is developed. Finally, with the help of demonstration learning, a DRL-based local path planning method is put forward, realized by combining the Heart algorithm with a local planner. Simulation results show that, compared with the TD3fD DRL method, IL-PPO2 achieves higher path planning efficiency and a higher success rate with shorter training time. In addition, compared with the Dynamic Window Approach (DWA), IL-PPO2 improves the success rate, and its paths are smoother owing to the integration of the Heart algorithm.