Prioritized Experience Replay-Based Generative SAC Algorithm and Its Application

Abstract

To address the limited exploration capability and weak state representation of the conventional soft actor-critic (SAC) algorithm in complex environments, we propose an improved soft actor-critic (ISAC) algorithm. First, ISAC introduces a prioritized experience replay (PER) mechanism that dynamically evaluates the priority of each experience sample from its temporal-difference (TD) error, so that crucial experiences are used more often and the learning efficiency of the algorithm improves. Second, the algorithm integrates a generative Transformer architecture into the actor network to strengthen its ability to capture state features dynamically, thereby significantly improving its performance on complex optimization tasks. Finally, we apply the algorithm to the dynamic scheduling optimization problem of university logistics staff. The experimental results show that, compared with the original SAC algorithm and the classic deep Q-network (DQN) algorithm, the proposed ISAC algorithm has smaller errors when dynamically fitting human resource demand, which demonstrates its advantages and practicality in real applications.
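
The PER mechanism described above can be illustrated with a minimal sketch. The abstract states only that priorities come from TD errors; everything else here is an assumption borrowed from the common proportional PER formulation rather than the authors' implementation: the class name `PrioritizedReplayBuffer`, the exponents `alpha` and `beta`, the offset `eps`, and the importance-sampling weights.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay; not the paper's code."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha  # assumed: 0 gives uniform sampling, 1 full prioritization
        self.eps = eps      # assumed: keeps zero-error samples reachable
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so each one can be replayed before its TD error is known.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform
        # sampling; they are normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority is the absolute TD error plus a small offset.
        for i, delta in zip(idxs, td_errors):
            self.priorities[i] = abs(delta) + self.eps
```

In a training loop of this shape, the critic update would multiply each sample's loss by its returned weight and then call `update_priorities` with the freshly computed TD errors.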

Keywords: deep reinforcement learning, soft actor-critic algorithm, prioritized experience replay, Transformer architecture, logistics management

CLC Number: TP181

ZHANG Wei, LI Yujun, XIE Wenwen, XU Yunjia, SUN Geng. Prioritized Experience Replay-Based Generative SAC Algorithm and Its Application [J]. Journal of Jilin University Science Edition, 2025, 63(6): 1713-1722.

