Journal of Jilin University (Engineering and Technology Edition) ›› 2024, Vol. 54 ›› Issue (5): 1417-1425. doi: 10.13229/j.cnki.jdxbgxb.20220779
• Computer Science and Technology •
Li LYU1,2, Mei-zi ZHU1,2, Ping KANG1,2, Long-zhe HAN1,2
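Abstract:
On manifold data, the density peaks clustering (DPC) algorithm has two weaknesses: its local density tends to identify wrong cluster centers, and its allocation strategy tends to misassign remaining samples that lie far from the cluster centers. To address both, this paper proposes a density peaks clustering algorithm based on second-order K-nearest neighbors and multi-cluster merging (DPC-SKMM). First, the local density is defined using the minimum second-order K-nearest neighbors, which accentuates the density difference between cluster centers and non-center samples so that the correct cluster centers are found. Second, K-nearest neighbors are used to find the local representative point of each sample, core points are determined from these representatives, and the core points guide the partition into micro-clusters. Finally, micro-clusters are merged according to an inter-micro-cluster attraction degree defined by the minimum second-order K-nearest neighbors and shared nearest neighbors, which avoids misassigning samples far from the cluster centers; the merging process requires no iteration. DPC-SKMM is compared with the IDPC-FA, DPCSA, FNDPC, FKNN-DPC, and DPC algorithms, and the experimental results show that DPC-SKMM clusters manifold and UCI datasets effectively.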
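For context, the two DPC weaknesses named in the abstract are easier to see against the baseline procedure. The sketch below is a minimal version of the original DPC algorithm (cutoff-kernel local density, distance to the nearest denser sample, chained label assignment), not DPC-SKMM itself, since the abstract does not give its formulas. The function name `dpc_baseline` and the parameters `d_c` and `n_clusters` are assumptions chosen for illustration.

```python
import math


def dpc_baseline(points, d_c, n_clusters):
    """Illustrative baseline DPC; returns (rho, delta, labels)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other samples closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit samples from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n   # distance to the nearest sample visited earlier
    parent = [-1] * n   # that sample's index
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest sample has no denser neighbor; use its max distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Cluster centers: the n_clusters samples with the largest rho * delta.
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Each remaining sample inherits the label of its parent. One wrong
    # parent therefore propagates to every sample chained behind it.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels


# Two well-separated groups of five samples each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
rho, delta, labels = dpc_baseline(points, d_c=1.1, n_clusters=2)
print(labels)
```

Both the density estimate and the final loop depend only on distances to denser samples, which is why a poor density estimate or a single wrong parent link can misplace a whole chain of samples on elongated, manifold-shaped clusters.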