Journal of Jilin University (Information Science Edition), 2021, Vol. 39, Issue (2): 192-199.
Abstract: SAC (Soft Actor-Critic) offers good robustness, strong exploration ability, and good agent generalization. However, it suffers from slow training and an unstable training process. To address this problem, we propose the PER-SAC (Prioritized Experience Replay Soft Actor-Critic) algorithm, which introduces prioritized experience sampling into SAC. The network thus trains preferentially on samples with large value-estimation error and poor policy performance, improving both the stability and the convergence speed of agent training. Experimental results show that PER-SAC outperforms SAC in training speed and stability.
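The core mechanism the abstract describes is proportional prioritized experience replay: transitions with larger TD (critic) error are sampled more often, with importance-sampling weights correcting the resulting bias. The sketch below illustrates that mechanism only; the paper's exact buffer implementation and hyperparameters (alpha, beta, eps below) are not given here, so those values are illustrative assumptions.

from collections import namedtuple

import numpy as np

Transition = namedtuple("Transition", "state action reward next_state done")

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay, as used on top of SAC."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priorities skew sampling
        self.eps = eps            # keeps every priority strictly positive
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def push(self, *args):
        # New transitions get the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(Transition(*args))
        else:
            self.buffer[self.pos] = Transition(*args)
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[: len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idxs = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias that non-uniform
        # sampling introduces into the gradient estimate.
        weights = (len(self.buffer) * probs[idxs]) ** (-beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Transitions with larger TD error get higher priority, so the
        # network trains more often on poorly estimated samples.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps

In a PER-SAC training loop, each critic update would call sample() to draw a weighted batch, scale the per-sample critic loss by the returned weights, and then call update_priorities() with the new absolute TD errors.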
Key words: deep reinforcement learning, Actor-Critic, maximum entropy, prioritized experience replay
LIU Qingqiang, LIU Pengyun. Soft Actor Critic Reinforcement Learning with Prioritized Experience Replay[J]. Journal of Jilin University (Information Science Edition), 2021, 39(2): 192-199.
URL: http://xuebao.jlu.edu.cn/xxb/EN/Y2021/V39/I2/192