Journal of System Simulation
Abstract
Current multi-agent reinforcement learning algorithms use experience data inefficiently and make it difficult to set an appropriate learning rate. To address these issues, this paper proposes a multi-agent proximal policy optimization (PPO) algorithm based on a bidirectional gated recurrent unit (BiGRU), with priority sampling and a dynamic learning rate. The algorithm incorporates a BiGRU network to strengthen the policy network's ability to model temporal information. A priority partial sampling mechanism is introduced to improve the utilization of high-value experience data. In addition, an improved Adam optimizer with dynamic learning rate adjustment is employed to address the difficulty of configuring the learning rate. Simulation experiments show that the algorithm significantly improves convergence speed, stability, and combat win rate, offering a new optimization scheme for multi-agent air combat decision-making.
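The abstract does not define the priority partial sampling mechanism. As a rough illustration only, the sketch below assumes that each transition's priority is the magnitude of its advantage estimate and that only a fraction of the rollout batch is drawn for each update. The function name `priority_partial_sample` and the parameters `fraction`, `alpha`, and `eps` are hypothetical and do not come from the paper.

```python
import random


def priority_partial_sample(priorities, fraction=0.5, alpha=1.0, eps=1e-6, rng=None):
    """Return sorted indices of a priority-weighted subset of a batch.

    Assumption (not from the paper): priority = |advantage|, and sampling is
    weighted and without replacement, so high-priority transitions are more
    likely to be kept while low-priority ones still have a chance.
    """
    rng = rng or random.Random()
    n = len(priorities)
    if n == 0:
        return []
    k = max(1, int(n * fraction))  # size of the partial sample
    # Turn raw priorities into strictly positive sampling weights.
    weights = [(abs(p) + eps) ** alpha for p in priorities]
    # Efraimidis-Spirakis weighted sampling without replacement:
    # draw key = u ** (1 / w) per item and keep the k largest keys.
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:k])


# Usage: keep half of a four-transition batch, favouring large |advantage|.
advantages = [0.1, -5.0, 0.2, 3.0]
selected = priority_partial_sample(advantages, fraction=0.5, rng=random.Random(0))
```

Sampling without replacement keeps every selected transition distinct, and the `eps` term keeps zero-advantage transitions from being excluded outright. Whether the paper makes the same choices is not stated in the abstract.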
Recommended Citation
Ding, Zhengkun; Liu, Jiaqi; Xu, Junzheng; Xu, Yuezhu; and Wang, Xingmei (2026) "Intelligent Air Combat Decision-making Method Based on BiGRU and Priority Dynamic Sampling," Journal of System Simulation: Vol. 38: Iss. 2, Article 15.
DOI: 10.16182/j.issn1004731x.joss.25-0472
Available at: https://dc-china-simulation.researchcommons.org/journal/vol38/iss2/15
First Page
447
Last Page
459
CLC
TP391.9
Recommended Citation
Ding Zhengkun, Liu Jiaqi, Xu Junzheng, et al. Intelligent Air Combat Decision-making Method Based on BiGRU and Priority Dynamic Sampling[J]. Journal of System Simulation, 2026, 38(2): 447-459.
Included in
Artificial Intelligence and Robotics Commons, Computer Engineering Commons, Numerical Analysis and Scientific Computing Commons, Operations Research, Systems Engineering and Industrial Engineering Commons, Systems Science Commons