Journal of Jilin University (Science Edition)

• Computer Science

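A Novel Matching Clustering Algorithm Based on Clear Rational Number Mean

SHANG Jingbo1, ZUO Wanli2

  1. College of Software, Jilin University, Changchun 130012, China; 2. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2016-11-29 Online: 2018-03-26 Published: 2018-03-27
  • Corresponding author: ZUO Wanli E-mail: zuowl@jlu.edu.cn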

A Novel Matching Clustering Algorithm Based on Clear Rational Number Mean

SHANG Jingbo1, ZUO Wanli2   

  1. College of Software, Jilin University, Changchun 130012, China; 2. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2016-11-29 Online: 2018-03-26 Published: 2018-03-27
  • Contact: ZUO Wanli E-mail: zuowl@jlu.edu.cn

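Abstract: By improving the clear rational number mean method, a new matching clustering algorithm is proposed. The clear rational number mean of each data record is computed first and then compared with the matching items to obtain the clustering result, which solves the clustering problem for manually annotated data. The method was tested and verified in the domain of anti-fraud Web pages and compared with the K-nearest neighbor algorithm using data sets of the same name but different types. Experimental results show that the method is more effective than the K-nearest neighbor algorithm in the domain of anti-fraud Web pages, and they also show that the new matching clustering algorithm is reasonable for clustering manually annotated data.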

Key words: clustering, data mining, clear rational number mean, matching

Abstract: We propose a novel matching clustering algorithm by improving the clear rational number mean method. The algorithm first calculates the clear rational number mean of each data record and then compares it with the matching items to obtain the clustering result, which solves the clustering problem for manually annotated data. The method was tested and verified in the domain of anti-fraud Web pages and compared with the K-nearest neighbor algorithm using data sets of the same name but different types. Experimental results show that the method is more effective than the K-nearest neighbor algorithm in the domain of anti-fraud Web pages, and they also show that the novel matching clustering algorithm is reasonable for clustering manually annotated data.

Key words: clustering, matching, clear rational number mean, data mining

CLC number:

  • TP391
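
The two-step flow described in the abstract (compute a mean per data record, then compare it with matching items to assign a cluster) can be illustrated with a toy sketch. The abstract does not define the clear rational number mean or the matching rule, so everything here is an assumption: the plain arithmetic mean, computed exactly with `fractions.Fraction`, stands in for the clear rational number mean; "matching" is taken to mean choosing the nearest reference value; and the names `record_mean`, `match_cluster`, `records`, and `match_items`, as well as the sample scores, are hypothetical. This is not the authors' algorithm.

```python
from fractions import Fraction


def record_mean(values):
    """Exact rational mean of one record.

    Assumption: a plain arithmetic mean stands in for the paper's
    clear rational number mean, which the abstract does not define.
    """
    fracs = [Fraction(v) for v in values]
    return sum(fracs, Fraction(0)) / len(fracs)


def match_cluster(records, match_items):
    """Assign each record the label whose reference value is nearest its mean.

    Assumption: nearest-value matching; the paper's rule may differ.
    """
    labels = []
    for values in records:
        m = record_mean(values)
        label = min(match_items, key=lambda name: abs(match_items[name] - m))
        labels.append(label)
    return labels


# Hypothetical manually annotated scores: three annotators rate each page 0..4.
records = [[0, 1, 0], [4, 3, 4], [2, 2, 1]]
# Hypothetical matching items: one reference value per cluster label.
match_items = {"normal": Fraction(1), "fraud": Fraction(3)}

labels = match_cluster(records, match_items)
print(labels)
```

Using exact fractions avoids floating-point ties when a record mean falls midway between two reference values; in that case `min` keeps the first label in dictionary order.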