Journal of System Simulation
Abstract
To address poor convergence and ineffective exploration when UAVs perform path planning in complex environments, an improved deep deterministic policy gradient (DDPG) algorithm is proposed. A dual experience pool mechanism stores successful and failed experiences separately, enabling the algorithm to exploit successful experiences to strengthen policy optimization and to learn from failed experiences to avoid erroneous paths. An artificial potential field (APF) method introduces a guidance term into the planning process, which is dynamically blended with noise-perturbed exploratory actions through randomized sampling. Multi-objective path planning is achieved by a combined reward function built from direction, distance, obstacle-avoidance, and time rewards, which also mitigates the reward-sparsity problem. Experiments show that the proposed algorithm significantly improves reward and success rate and converges in a shorter time.
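The two mechanisms named in the abstract can be illustrated with a minimal sketch. The class and function below are hypothetical reconstructions, not the authors' implementation: the buffer capacity, sampling ratio, noise scale, and blending weight `beta` are illustrative placeholders, and the exact rule the paper uses to mix the APF guidance term with the noisy action is not given in the abstract.

```python
import random
import numpy as np

class DualExperiencePool:
    """Sketch of a dual experience pool: successful and failed transitions
    are kept in separate buffers and sampled together in a fixed ratio.
    Capacity and ratio values are illustrative assumptions."""

    def __init__(self, capacity=10000, success_ratio=0.5):
        self.success = []   # transitions from episodes that reached the goal
        self.failure = []   # transitions from collisions / timeouts
        self.capacity = capacity
        self.success_ratio = success_ratio

    def store(self, transition, succeeded):
        buf = self.success if succeeded else self.failure
        buf.append(transition)
        if len(buf) > self.capacity:   # drop the oldest transition
            buf.pop(0)

    def sample(self, batch_size):
        # Draw a mixed minibatch: part success experience to reinforce
        # good behavior, part failure experience to avoid wrong paths.
        n_succ = min(int(batch_size * self.success_ratio), len(self.success))
        n_fail = min(batch_size - n_succ, len(self.failure))
        return random.sample(self.success, n_succ) + random.sample(self.failure, n_fail)

def apf_guided_action(policy_action, apf_action, noise_sigma=0.1, beta=0.5):
    """Blend the actor's noise-perturbed action with an APF guidance term.
    `beta` weights the guidance; a dynamic schedule for it is assumed
    in the paper but not specified in the abstract."""
    noisy = policy_action + np.random.normal(0.0, noise_sigma, size=policy_action.shape)
    return (1.0 - beta) * noisy + beta * apf_action
```

With `beta` decayed over training, the agent would rely on the potential-field direction early on and shift toward its learned policy later, which is one common way such a guidance term is scheduled.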
Recommended Citation
Zhang, Sen and Dai, Qiangqiang (2025) "UAV Path Planning Based on Improved Deep Deterministic Policy Gradients," Journal of System Simulation: Vol. 37: Iss. 4, Article 4.
DOI: 10.16182/j.issn1004731x.joss.23-1524
Available at:
https://dc-china-simulation.researchcommons.org/journal/vol37/iss4/4
First Page
875
Last Page
881
CLC
TP273
Included in
Artificial Intelligence and Robotics Commons, Computer Engineering Commons, Numerical Analysis and Scientific Computing Commons, Operations Research, Systems Engineering and Industrial Engineering Commons, Systems Science Commons