Journal of System Simulation

Abstract

To address job shop scheduling in a dynamic environment, a dynamic scheduling algorithm based on an improved Q-learning algorithm and dispatching rules is proposed. The state space of the dynamic scheduling algorithm is described using the concept of "urgency of remaining tasks", and a reward function following the principle of "the higher the slack, the higher the penalty" is designed. To address the problem that the greedy strategy tends to select sub-optimal actions in the later stage of learning, the traditional Q-learning algorithm is improved by introducing an action selection strategy based on the softmax function, so that the improved algorithm selects different actions with more nearly equal probability in the early stage of learning. Simulation results on 6 different test instances show that the performance indicator of the proposed scheduling algorithm improves by an average of about 6.5% over the algorithm before improvement, and by about 38.3% and 38.9% over the IPSO and PSO algorithms, respectively. The indicator is significantly better than conventional methods such as single dispatching rules and traditional optimization algorithms.
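For readers unfamiliar with softmax (Boltzmann) action selection, the sketch below illustrates the general idea in Python. The abstract does not give the paper's exact formulation, so the temperature schedule and the use of dispatching-rule indices as the action set are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np

def softmax_action(q_values, temperature):
    """Boltzmann (softmax) action selection over a row of Q-values.

    A high temperature gives near-uniform probabilities (exploration,
    as in the early learning stage described in the abstract); a low
    temperature concentrates on the greedy action (exploitation).
    """
    prefs = np.asarray(q_values, dtype=float) / temperature
    prefs -= prefs.max()          # subtract max for numerical stability
    probs = np.exp(prefs)
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

# Hypothetical usage: actions index candidate dispatching rules.
q_row = [0.2, 0.5, 0.1, 0.4]      # Q(s, a) for the current state s
tau_start, tau_end, episodes = 5.0, 0.1, 500
for ep in range(episodes):
    # Assumed geometric annealing from tau_start down to tau_end.
    tau = tau_start * (tau_end / tau_start) ** (ep / (episodes - 1))
    a = softmax_action(q_row, tau)
    # ... apply dispatching rule a, observe reward, update Q(s, a) ...
```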

First Page

1247

Revised Date

2021-03-14

Last Page

1258

CLC

TB497

Recommended Citation

Yejian Zhao, Yanhong Wang, Jun Zhang, Hongxia Yu, Zhongda Tian. Application of Improved Q Learning Algorithm in Job Shop Scheduling Problem[J]. Journal of System Simulation, 2022, 34(6): 1247-1258.

Corresponding Author

Yanhong Wang, wangyh_sut@163.com

DOI

10.16182/j.issn1004731x.joss.21-0099
