Journal of Jilin University Science Edition ›› 2025, Vol. 63 ›› Issue (4): 1105-1116.


Emergency Resource Allocation Strategy Based on DBSDER-QL Algorithm

YANG Hao1, ZHANG Chijun1,2, ZHANG Xinwei3   

  1. School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China;
  2. International Business School, Guangdong University of Finance & Economics, Guangzhou 510320, China;
  3. Student Affairs Office, Changchun University, Changchun 130022, China
  • Received: 2025-02-24 Online: 2025-07-26 Published: 2025-07-26

Abstract: To address the problem of emergency resource allocation for natural disasters, we propose a Q-learning algorithm based on dynamic Boltzmann Softmax (DBS) and a dynamic exploration rate (DER), denoted DBSDER-QL. Firstly, the DBS strategy dynamically adjusts the weights of action values, which promotes stable convergence of the algorithm and mitigates the overestimation caused by the maximum operator. Secondly, the DER strategy improves the convergence and stability of the algorithm, addressing the failure of Q-learning with a fixed exploration rate to fully converge to the optimal strategy in the later stage of training. Finally, ablation experiments verify the effectiveness of the DBS and DER strategies. Experimental comparisons with dynamic programming, the greedy algorithm, and the traditional Q-learning algorithm show that DBSDER-QL is significantly better than these traditional methods in terms of total cost and computational efficiency, showing higher applicability and effectiveness.
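
The two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the schedules, so the linear growth of the Boltzmann parameter `beta`, the exponential decay of the exploration rate, and all function names and default values below are assumptions chosen for illustration only.

```python
import math
import random


def boltzmann_softmax_value(q_values, beta):
    """Softmax-weighted average of action values.

    beta = 0 gives the plain mean; as beta grows, the result approaches
    max(q_values), so the operator interpolates between mean and max.
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


def dynamic_beta(episode, scale=0.05):
    """Hypothetical schedule: beta grows linearly with the episode index."""
    return scale * episode


def exploration_rate(episode, eps_start=1.0, eps_min=0.01, decay=0.995):
    """Hypothetical schedule: exponential decay with a lower bound."""
    return max(eps_min, eps_start * decay ** episode)


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy action selection using the current exploration rate."""
    row = q_table[state]
    if rng.random() < epsilon:
        return rng.randrange(len(row))
    return max(range(len(row)), key=row.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha, gamma, beta):
    """Q-learning update with the max operator replaced by Boltzmann softmax."""
    target = reward + gamma * boltzmann_softmax_value(q_table[next_state], beta)
    q_table[state][action] += alpha * (target - q_table[state][action])
```

In a training loop, each episode would compute `beta = dynamic_beta(episode)` and `epsilon = exploration_rate(episode)` and pass them to `q_update` and `choose_action`. Under these assumed schedules, early training averages over action values and explores widely, while late training approaches the standard greedy maximum with little exploration.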

Key words: resource allocation, reinforcement learning, Q-learning algorithm, dynamic exploration rate, dynamic Boltzmann Softmax

CLC Number: 

  • TP391