Journal of Jilin University (Information Science Edition), 2021, Vol. 39, Issue (2): 192-199.
Abstract: SAC (Soft Actor-Critic) offers good robustness, strong exploration ability, and good agent generalization. However, it suffers from slow training and an unstable training process. To address this problem, we propose the PER-SAC (Prioritized Experience Replay Soft Actor-Critic) algorithm, which introduces prioritized experience sampling into SAC. The network thus trains preferentially on samples with large value-estimation error and poor policy performance, improving both the stability and the convergence speed of agent training. Experimental results show that PER-SAC outperforms SAC in training speed and stability.
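The core mechanism the abstract describes is proportional prioritized experience replay: transitions with larger TD (critic) error are sampled more often, with importance-sampling weights correcting the resulting bias. The sketch below illustrates that mechanism only; the paper's exact buffer implementation and hyperparameters (alpha, beta, eps below) are not given here, so those values are illustrative assumptions.

from collections import namedtuple

import numpy as np

Transition = namedtuple("Transition", "state action reward next_state done")

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay, as used on top of SAC."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priorities skew sampling
        self.eps = eps            # keeps every priority strictly positive
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def push(self, *args):
        # New transitions get the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(Transition(*args))
        else:
            self.buffer[self.pos] = Transition(*args)
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[: len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idxs = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias that non-uniform
        # sampling introduces into the gradient estimate.
        weights = (len(self.buffer) * probs[idxs]) ** (-beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Transitions with larger TD error get higher priority, so the
        # network trains more often on poorly estimated samples.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps

In a PER-SAC training loop, each critic update would call sample() to draw a weighted batch, scale the per-sample critic loss by the returned weights, and then call update_priorities() with the new absolute TD errors.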
Key words: deep reinforcement learning, Actor-Critic, maximum entropy, prioritized experience replay
LIU Qingqiang, LIU Pengyun. Soft Actor Critic Reinforcement Learning with Prioritized Experience Replay[J]. Journal of Jilin University (Information Science Edition), 2021, 39(2): 192-199.
URL: http://xuebao.jlu.edu.cn/xxb/EN/Y2021/V39/I2/192