Journal of System Simulation

Abstract

To address the problems of overestimation and underestimation bias, low sample utilization, and the difficulty of balancing exploration and exploitation in reinforcement-learning-based path planning, an improved SAC method was proposed. The balance between exploration and exploitation was maintained by adaptively adjusting the temperature coefficient that scales the entropy term. On the basis of the SAC framework, a triple-Critic architecture was introduced, in which the minimum and mean Q-values are dynamically weighted and fused according to Q-value uncertainty, balancing overestimation against underestimation bias. A mixed dynamic-sampling experience replay buffer was designed: experience data is partitioned by a reward threshold, and sampling ratios are adjusted dynamically to achieve progressive learning from core strategies to comprehensive generalization. A hierarchical heuristic reward function was designed to guide the robot in balancing the multi-objective demands of approaching the goal and avoiding obstacles. Simulation results demonstrate that the improved algorithm outperforms comparison methods in path length, planning time, and success rate, improving both the efficiency and robustness of path planning.
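The two core mechanisms described in the abstract can be illustrated with minimal sketches. First, the uncertainty-weighted triple-Critic fusion: the abstract states only that the minimum and mean Q-values are dynamically fused with weights driven by Q-value uncertainty, so the sketch below uses the standard deviation across three critics as an uncertainty proxy and a sigmoid as the weighting map. The function name `fused_target_q` and the exact weighting form are assumptions, not the paper's formulation.

```python
import torch

def fused_target_q(q1, q2, q3):
    """Blend the minimum and mean of three target-critic estimates,
    weighted by their disagreement (std as an assumed uncertainty proxy)."""
    qs = torch.stack([q1, q2, q3], dim=0)   # shape: (3, batch)
    q_min = qs.min(dim=0).values            # pessimistic estimate (counters overestimation)
    q_mean = qs.mean(dim=0)                 # neutral estimate (counters underestimation)
    uncertainty = qs.std(dim=0)             # per-sample critic disagreement
    w = torch.sigmoid(uncertainty)          # assumed mapping: more disagreement -> more pessimism
    return w * q_min + (1.0 - w) * q_mean
```

Second, the mixed dynamic-sampling replay buffer: the abstract says experience is partitioned by a reward threshold and that sampling ratios are annealed so learning shifts from core strategies to comprehensive generalization. The class name, threshold default, and annealing schedule below are illustrative assumptions under that description.

```python
import random
from collections import deque

class MixedDynamicReplayBuffer:
    """Two pools split by a reward threshold; the core-pool sampling
    ratio decays with training progress (schedule is an assumption)."""

    def __init__(self, capacity=100_000, reward_threshold=0.0):
        self.core = deque(maxlen=capacity // 2)     # high-reward ("core strategy") transitions
        self.regular = deque(maxlen=capacity // 2)  # all remaining transitions
        self.reward_threshold = reward_threshold

    def push(self, transition, reward):
        pool = self.core if reward >= self.reward_threshold else self.regular
        pool.append(transition)

    def sample(self, batch_size, progress):
        """progress in [0, 1]: early training favors the core pool,
        late training favors the full experience distribution."""
        core_ratio = max(0.2, 0.8 * (1.0 - progress))        # assumed annealing schedule
        n_core = min(int(batch_size * core_ratio), len(self.core))
        n_regular = min(batch_size - n_core, len(self.regular))
        return (random.sample(list(self.core), n_core)
                + random.sample(list(self.regular), n_regular))
```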

First Page

714

Last Page

724

CLC

TP242

Recommended Citation

Li Dequan, Xiong Wan. Robot Path Planning by Reinforcement Learning Based on SAC3Q-HDM[J]. Journal of System Simulation, 2026, 38(3): 714-724.

DOI

10.16182/j.issn1004731x.joss.25-0399
