Journal of Jilin University Science Edition ›› 2025, Vol. 63 ›› Issue (3): 885-894.

Multi-actor Deterministic Policy Gradient Algorithm Based on Progressive k-Means Clustering

LIU Quan1,2, LIU Xiaosong2, WU Guangjun2, LIU Yuhan3   

  1. School of Computer Science and Technology, Kashi University, Kashi 844000, Xinjiang Uygur Autonomous Region, China; 2. School of Computer Science and Technology, Soochow University, Suzhou 215008, Jiangsu Province, China; 3. Academy of Future Education, Xi’an Jiaotong-Liverpool University, Suzhou 215000, Jiangsu Province, China
  • Received: 2024-01-25 Online: 2025-05-26 Published: 2025-05-26

Abstract: To address the poor learning performance and high fluctuation of the deep deterministic policy gradient (DDPG) algorithm on tasks with large state spaces, we propose a multi-actor deep deterministic policy gradient algorithm based on progressive k-means clustering (MDDPG-PK-Means). During training, when an action is selected for the state at each time step, the decision-making of the actor network is assisted by the discrimination results of the k-means clustering algorithm. Meanwhile, the number of k-means cluster centers gradually increases as the number of training steps grows. The MDDPG-PK-Means algorithm was applied on the MuJoCo simulation platform, and the experimental results show that it performs better than DDPG and other algorithms on most continuous tasks.
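
The abstract does not give the clustering schedule or say how a cluster is tied to an actor, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that states are clustered with plain Lloyd's k-means, that the number of centers grows by one at a fixed step interval, and that the index of the nearest center is used to pick which actor to query. The names `num_clusters`, `kmeans`, `select_actor` and the parameters `k_start`, `k_max`, `grow_every` are hypothetical.

```python
import numpy as np


def num_clusters(step, k_start=1, k_max=4, grow_every=1000):
    """Hypothetical schedule: add one cluster center every `grow_every` steps."""
    return min(k_max, k_start + step // grow_every)


def kmeans(states, k, iters=20, seed=0):
    """Plain Lloyd's k-means on an (n, d) array; returns a (k, d) array of centers."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct rows of the data (requires k <= n).
    centres = states[rng.choice(len(states), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every state to every center, shape (n, k).
        dists = np.linalg.norm(states[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = states[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centres[j] = members.mean(axis=0)
    return centres


def select_actor(state, centres):
    """Index of the nearest center, assumed here to be the index of the actor to use."""
    return int(np.linalg.norm(centres - state, axis=1).argmin())
```

In a training loop, one would presumably refit the centers on states drawn from the replay buffer whenever `num_clusters(step)` changes, then call `select_actor` on the current state before querying the chosen actor network. How the paper actually maps clusters to actors may differ.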

Key words: deep reinforcement learning, deterministic policy gradient algorithm, k-means clustering, multi-actor

CLC Number: 

  • TP18