A Novel Matching Clustering Algorithm Based on Clear Rational Number Mean

Journal of Jilin University (Science Edition)

SHANG Jingbo¹, ZUO Wanli²

1. College of Software, Jilin University, Changchun 130012, China; 2. College of Computer Science and Technology, Jilin University, Changchun 130012, China

Received: 2016-11-29 Online: 2018-03-26 Published: 2018-03-27
Contact: ZUO Wanli E-mail: zuowl@jlu.edu.cn

Abstract

Abstract: We propose a novel matching clustering algorithm that improves on the clear rational number mean method. The algorithm first computes the clear rational number mean of each data record and then compares it with the matching items to obtain the clustering result, which addresses the problem of clustering manually annotated data. The method was tested and verified in the domain of anti-fraud Web pages and compared with the K-nearest neighbor algorithm on data sets that share the same name but differ in type. Experimental results show that the method is more effective than the K-nearest neighbor algorithm in this domain, and that the novel matching clustering algorithm is a reasonable approach to clustering manually annotated data.

Key words: clustering, data mining, clear rational number mean, matching

CLC number: TP391

SHANG Jingbo, ZUO Wanli. A Novel Matching Clustering Algorithm Based on Clear Rational Number Mean[J]. Journal of Jilin University (Science Edition), 2018, 56(2): 399-401.
