Journal of System Simulation

Abstract

To address the poor convergence and ineffective exploration of UAVs performing path planning in complex environments, an improved deep deterministic policy gradient (DDPG) algorithm is proposed. A dual experience pool mechanism stores successful and failed experiences separately, so the algorithm can exploit successful experiences to strengthen policy optimization while learning from failed ones to avoid erroneous paths. An artificial potential field (APF) method introduces a guidance term into the planning process, which is dynamically blended with noisy exploratory actions during randomized sampling to select the executed action. Multi-objective optimization of path planning is achieved by designing a combined reward function from direction, distance, obstacle-avoidance, and time rewards, which also alleviates the reward sparsity problem. Experiments show that the proposed algorithm significantly improves reward and success rate and reaches convergence in a shorter time.
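The mechanisms the abstract describes can be outlined in code. The sketch below is an illustrative reconstruction, not the authors' implementation: the pool capacity, the success-to-failure sampling ratio, the noise scale, the blending weight `beta`, and the reward weights `w` are all assumed values chosen for the example.

```python
import random
import numpy as np

class DualReplayBuffer:
    """Dual experience pool: successful and failed transitions are stored
    separately so that training batches can mix both kinds (sketch)."""

    def __init__(self, capacity=10000):  # capacity per pool is an assumption
        self.success, self.failure = [], []
        self.capacity = capacity

    def add(self, transition, succeeded):
        pool = self.success if succeeded else self.failure
        pool.append(transition)
        if len(pool) > self.capacity:
            pool.pop(0)  # drop the oldest transition when the pool is full

    def sample(self, batch_size, success_ratio=0.7):
        """Draw a batch biased toward successful experience; the 0.7 ratio
        is a hypothetical choice, not taken from the paper."""
        n_succ = min(int(batch_size * success_ratio), len(self.success))
        n_fail = min(batch_size - n_succ, len(self.failure))
        return (random.sample(self.success, n_succ)
                + random.sample(self.failure, n_fail))

def apf_guided_action(policy_action, apf_action, noise_sigma=0.1, beta=0.5):
    """Blend the actor's noise-perturbed action with an APF guidance term.
    The linear weighting by `beta` is one plausible way to 'dynamically
    integrate' the two actions; the paper may use a different rule."""
    noisy = policy_action + np.random.normal(
        0.0, noise_sigma, size=np.shape(policy_action))
    return (1.0 - beta) * noisy + beta * apf_action

def combined_reward(r_dir, r_dist, r_obs, r_time, w=(0.3, 0.3, 0.3, 0.1)):
    """Weighted sum of the direction, distance, obstacle-avoidance, and
    time reward terms; the weights `w` are assumed for illustration."""
    return w[0] * r_dir + w[1] * r_dist + w[2] * r_obs + w[3] * r_time
```

In a DDPG training loop, `apf_guided_action` would replace the usual noisy-action step, and each finished episode's transitions would go into the success or failure pool depending on whether the UAV reached its goal.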

First Page

875

Last Page

881

CLC

TP273

Recommended Citation

Zhang Sen, Dai Qiangqiang. UAV Path Planning Based on Improved Deep Deterministic Policy Gradients[J]. Journal of System Simulation, 2025, 37(4): 875-881.

DOI

10.16182/j.issn1004731x.joss.23-1524
