Journal of Jilin University (Science Edition), 2019, Vol. 57, Issue (1): 111-120.

• Computer Science •

基于K近邻和多类合并的密度峰值聚类算法

薛小娜1, 高淑萍1, 彭弘铭2, 吴会会1   

  1. 1. 西安电子科技大学 数学与统计学院, 西安 710071; 
    2. 西安电子科技大学 通信工程学院, 西安 710071
  • 收稿日期:2017-12-16 出版日期:2019-01-26 发布日期:2019-02-08
  • 通讯作者: 薛小娜 E-mail:xiaona_xue@163.com

Density Peaks Clustering Algorithm Based onKNearest Neighbors and ClassesMerging#br#

XUE Xiaona1, GAO Shuping1, PENG Hongming2, WU Huihui1   

  1. 1. School of Mathematics and Statistics, Xidian University, Xi’n 710071, China;
    2. School of Telecommunications Engineering, Xidian University, Xi’an 710071, China
  • Received:2017-12-16 Online:2019-01-26 Published:2019-02-08
  • Contact: XUE Xiaona E-mail:xiaona_xue@163.com

Abstract: To address the poor clustering performance of the density peaks clustering (DPC) algorithm on data sets with complex structure, high dimensionality, and multiple density peaks within the same class, we propose a density peaks clustering algorithm based on K-nearest neighbors and classes merging (KM-DPC). First, the sample distribution is described using a newly defined density calculation method, and the cluster centers are obtained using a new evaluation index. Second, an iterative assignment strategy based on the idea of K-nearest neighbors is designed to assign the remaining data points accurately. Finally, a local class merging method is presented to prevent classes that contain multiple density peaks from being split. Simulation results show that the algorithm clearly outperforms the DPC algorithm on 22 different data sets.

Key words: clustering, local density, density peaks, K-nearest neighbors, classes merging

CLC Number:

  • TP181
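
For readers unfamiliar with the baseline, the following is a minimal sketch of the original DPC procedure that the abstract compares against. It is not KM-DPC itself: the paper's density definition, evaluation index, K-nearest-neighbor assignment strategy, and merging rule are not given on this page and are not reproduced here. The Gaussian-kernel density, the product of density and distance used to rank candidate centers, and the names `dpc_baseline`, `dc`, and `n_clusters` are assumptions made for illustration.

```python
import numpy as np


def dpc_baseline(X, dc, n_clusters):
    """Sketch of baseline density peaks clustering (NOT the KM-DPC variant).

    X          : (n, d) array of samples
    dc         : cutoff distance used as the kernel width (assumed parameter)
    n_clusters : number of centers to pick (assumed to be given by the user)
    """
    n = len(X)
    # Pairwise Euclidean distances between all samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Local density with a Gaussian kernel (an assumed choice);
    # subtract 1 to remove each point's contribution to itself.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0

    # Indices sorted by decreasing density.
    order = np.argsort(-rho, kind="stable")

    delta = np.zeros(n)        # distance to the nearest denser point
    parent = np.full(n, -1)    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = d[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(d[i, higher])]
        delta[i] = d[i, j]
        parent[i] = j

    # Rank candidate centers by rho * delta. The densest point is forced to be
    # a center so that every other point has a labeled ancestor.
    gamma = rho * delta
    gamma[order[0]] = np.inf
    centers = np.argsort(-gamma)[:n_clusters]

    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Visit points from dense to sparse, so each parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Calling `dpc_baseline(X, dc=1.0, n_clusters=2)` on two well-separated groups of points should give each group its own label. Because every non-center point simply inherits the label of its nearest denser neighbor, one wrong link propagates to all points below it, and a class with several density peaks may be split across several centers. These are the weaknesses that the abstract's iterative assignment and class merging steps are intended to address.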