This paper proposes a PID control algorithm based on enhanced proximal policy optimization (Enhanced PPO-PID) to address the trajectory tracking problem of multi-degree-of-freedom robotic arm systems. First, the proposed method constructs a unified network architecture around a shared feature extraction layer, which significantly reduces the number of model parameters while strengthening the joint optimization of the policy and value functions, thereby improving convergence speed and learning efficiency. Second, we propose a mechanism that dynamically adjusts the number of training iterations based on the reward, balancing rapid convergence in the early stages of training against policy stability in the later stages. Third, we introduce value function clipping, which smooths the learning curve by limiting the magnitude of a single update, thereby enhancing the robustness of training in high-variance environments. Finally, comparative experiments on the PandaReach-v3 robotic arm system verify the superiority of the proposed method in trajectory tracking performance and robustness.
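To make the idea of value function clipping concrete, the following is a minimal sketch of the clipped value loss commonly used in PPO implementations. It is an illustration under stated assumptions, not necessarily the exact formulation used in this paper: the function name `clipped_value_loss`, the clipping range `clip_eps`, and the 0.5 scaling factor are all hypothetical choices made for this example.

```python
def clipped_value_loss(values, old_values, returns, clip_eps=0.2):
    """Mean clipped squared-error value loss over a batch of floats.

    values:     current value predictions V(s)
    old_values: predictions recorded when the data was collected
    returns:    empirical return targets
    clip_eps:   assumed maximum change allowed per update (hypothetical default)
    """
    if not (len(values) == len(old_values) == len(returns)):
        raise ValueError("inputs must have equal length")
    if not values:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for v, v_old, ret in zip(values, old_values, returns):
        # Restrict the new prediction to within clip_eps of the old one.
        delta = max(-clip_eps, min(clip_eps, v - v_old))
        v_clipped = v_old + delta
        # Keep the larger (more pessimistic) of the two squared errors, so a
        # prediction that jumps far from the old one gains nothing from it.
        total += max((v - ret) ** 2, (v_clipped - ret) ** 2)
    return 0.5 * total / len(values)
```

In this sketch, a prediction that moves more than `clip_eps` away from its previous value is still penalized as if it had only moved by `clip_eps`, which is one way to limit the magnitude of a single value update.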