Journal of System Simulation

Abstract

Current multi-agent reinforcement learning algorithms use experience data inefficiently and make it difficult to set an appropriate learning rate. To address these issues, this paper proposes a BiGRU-based multi-agent PPO algorithm with priority sampling and a dynamic learning rate. The algorithm incorporates a BiGRU network to strengthen the policy network's ability to model temporal information. A priority partial sampling mechanism improves the utilization of high-value experience data, and an improved Adam optimizer with dynamic learning rate adjustment addresses the difficulty of configuring the learning rate. Simulation results show that the algorithm significantly improves convergence speed, stability, and combat win rate, offering a new optimization scheme for multi-agent air combat decision-making.
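The abstract does not specify how priority partial sampling is carried out, so the following is only a minimal sketch of one plausible reading: part of each minibatch is drawn with probability proportional to a per-sample priority (for example, the absolute advantage), and the remainder is drawn uniformly. The function name `priority_partial_sample` and the parameters `priority_fraction` and `alpha` are hypothetical and are not taken from the paper.

```python
import random


def priority_partial_sample(priorities, batch_size, priority_fraction=0.5,
                            alpha=0.6, rng=None):
    """Return sample indices: a fraction chosen by priority, the rest uniformly.

    All names and defaults here are illustrative assumptions, not the
    paper's actual mechanism.
    """
    rng = rng or random.Random()
    n = len(priorities)
    if n == 0 or batch_size <= 0:
        return []

    # Split the batch into a priority-driven part and a uniform part.
    n_priority = int(round(batch_size * priority_fraction))
    n_uniform = batch_size - n_priority

    # Weight each sample by |priority|^alpha; the small epsilon keeps
    # zero-priority samples reachable.
    weights = [(abs(p) + 1e-6) ** alpha for p in priorities]

    chosen = rng.choices(range(n), weights=weights, k=n_priority)
    chosen += [rng.randrange(n) for _ in range(n_uniform)]
    return chosen
```

Mixing in uniform draws is one common way to keep low-priority experience from being ignored entirely; whether the paper does this, and with what ratio, would need to be checked against the full text.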

First Page

447

Last Page

459

CLC

TP391.9

Recommended Citation

Ding Zhengkun, Liu Jiaqi, Xu Junzheng, et al. Intelligent Air Combat Decision-making Method Based on BiGRU and Priority Dynamic Sampling[J]. Journal of System Simulation, 2026, 38(2): 447-459.

Corresponding Author

Xu Yuezhu

DOI

10.16182/j.issn1004731x.joss.25-0472
