Journal of Jilin University (Science Edition) ›› 2025, Vol. 63 ›› Issue (6): 1663-1672.


K-Means Clustering Algorithm Based on Heuristic Crossover Strategy Optimization

ZHANG Lina1, ZHANG Xingrui1, MA Li2, YU Helong1, SONG Xinyi1

  1. College of Information Technology, Jilin Agricultural University, Changchun 130118, China;
  2. College of Internet of Things Engineering, Wuxi University, Wuxi 214105, Jiangsu Province, China
  • Received: 2025-02-26 Online: 2025-11-26 Published: 2025-11-26
  • Corresponding author: YU Helong E-mail: yuhelong@jlau.edu.cn

Abstract: To address the sensitivity of the traditional K-Means algorithm to initial centroids, its tendency to fall into local optima, and its failure to fully mine the latent semantic features of clustering results, we propose a K-Means clustering algorithm based on heuristic crossover strategy optimization. First, a density-driven heuristic crossover initialization strategy selects representative parent points in high-density regions and introduces a crossover coefficient to dynamically generate diverse initial centroids, reducing the variability in clustering results caused by random initialization. Second, during the clustering iterations, the algorithm combines parent-point information with the intra-cluster mean update rule and dynamically adjusts centroid positions through crossover operations, addressing the inter-cluster overlap that local optima cause in the traditional algorithm. Finally, the optimized clustering results are fed into a multi-layer perceptron, whose nonlinear mapping ability is used to mine latent features, achieving deep fusion of the clustering results with deep semantic features. Experimental results show that the silhouette coefficient, Davies-Bouldin index, and adjusted Rand index of the algorithm reach 0.634, 1.398, and 0.621, respectively, significantly outperforming other improved algorithms and effectively improving clustering accuracy, stability, and interpretability.
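
The abstract does not give the formulas behind the first two steps, so the following is only a minimal sketch of how density-driven parent selection and crossover-based centroid updates might look. Everything specific here is an assumption made for illustration, not the authors' method: the k-nearest-neighbour density proxy, the "dense and far from already chosen parents" selection rule, the arithmetic crossover `w * a + (1 - w) * b`, and the names `alpha`, `beta`, `select_parents`, and `crossover_kmeans`. The multi-layer perceptron stage is not covered.

```python
import math
import random  # only needed for the usage example below


def knn_density(points, n_neighbors):
    """Density proxy (assumed): inverse mean distance to the n nearest neighbours."""
    dens = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        dens.append(1.0 / (sum(d[:n_neighbors]) / n_neighbors + 1e-12))
    return dens


def blend(a, b, w):
    """Arithmetic crossover (assumed form): w * a + (1 - w) * b per coordinate."""
    return tuple(w * x + (1.0 - w) * y for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def select_parents(points, k, n_neighbors):
    """Pick k dense, mutually distant points as parents (assumed selection rule)."""
    dens = knn_density(points, n_neighbors)
    chosen = [max(range(len(points)), key=lambda i: dens[i])]
    while len(chosen) < k:
        def score(i):
            # Already chosen points score 0 because their distance to themselves is 0.
            return dens[i] * min(math.dist(points[i], points[c]) for c in chosen)
        chosen.append(max(range(len(points)), key=score))
    return [points[i] for i in chosen]


def crossover_kmeans(points, k, alpha=0.7, beta=0.9, n_neighbors=5, max_iter=100):
    """K-Means with crossover-style initialization and updates (hypothetical sketch).

    Requires len(points) > n_neighbors and Python 3.8+ (math.dist).
    """
    parents = select_parents(points, k, n_neighbors)
    # Initial centroid: cross each parent with the mean of its nearest neighbours.
    centroids = []
    for p in parents:
        nearest = sorted(points, key=lambda q: math.dist(p, q))[1:n_neighbors + 1]
        centroids.append(blend(p, mean(nearest), alpha))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                # Update: cross the intra-cluster mean with the cluster's parent.
                centroids[c] = blend(mean(members), parents[c], beta)
    return labels, centroids
```

A usage example on two synthetic, well-separated blobs: `rng = random.Random(42)`, then `pts = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)] + [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(30)]`, and `labels, centroids = crossover_kmeans(pts, k=2)`. Setting `beta = 1.0` reduces the update to the standard K-Means mean update, which makes the effect of the assumed parent term easy to isolate.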

Key words: heuristic crossover strategy, K-Means clustering algorithm, multi-layer perceptron, feature fusion
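
Of the three reported metrics, the silhouette coefficient has a compact standard definition: for each sample, s = (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster; the score is the average of s over all samples. The sketch below is a plain-Python illustration of that definition under Euclidean distance. The function name `silhouette_coefficient` is chosen for this example, and the code is not the evaluation pipeline used in the paper.

```python
import math


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all samples (Euclidean distance, Python 3.8+)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes s = 0
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, hence the n - 1 divisor).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)
```

Values lie in [-1, 1], and higher is better. For example, the four points `(0, 0), (0, 1), (10, 0), (10, 1)` score close to 0.9 when labelled `[0, 0, 1, 1]` and score below zero when labelled `[0, 1, 0, 1]`.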

CLC number:

  • TP399