Journal of System Simulation

Abstract

For UAV team orienteering under multiple constraints, traditional optimization methods are computationally inefficient, while reinforcement learning approaches often yield low-quality solutions at high training cost. To address this, the paper proposes an attention-based reinforcement learning method. A dynamic attention policy network with multi-information fusion is designed to improve solution quality. A visibility-graph approach simplifies the threat-zone constraints and speeds up convergence, and a decoding-sequence reordering mechanism further improves the solution. Simulation results show that the method generates high-quality solutions within milliseconds, with total rewards that approach or even surpass those obtained by traditional solvers such as OR-Tools and PyVRP, which take several seconds to hundreds of seconds. Training efficiency also improves significantly: the training time per epoch falls from several hours to about 30 minutes.

First Page

360

Last Page

371

CLC

TP391.9; TP181

Recommended Citation

Yang Can, Chen Kai, Zhu Feng. Reinforcement Learning Based Method for UAV Team Orienteering Optimization under Multi-constraint Condition[J]. Journal of System Simulation, 2026, 38(2): 360-371.

Corresponding Author

Zhu Feng

DOI

10.16182/j.issn1004731x.joss.25-0595
