
• Computer Science

Outlier Detection Based on a Data Pattern Clustering Algorithm

李永丽1,2, 任辉明3, 董立岩1, 李威1, 陈思国1, 赵宇2   

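  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. School of Computer Science, Northeast Normal University, Changchun 130024, China; 3. China Mobile Communication Group Jilin Corporation, New Service Support Center, Changchun 130061, China
  • Received: 2006-10-31 Revised: 1900-01-01 Online: 2007-05-26 Published: 2007-05-26
  • Corresponding author: 李永丽 (LI Yongli)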

Outlier Testing Method Based on Pattern Clustering Algorithm

LI Yongli1,2, REN Huiming3, DONG Liyan1, LI Wei1, CHEN Siguo1, ZHAO Yu2   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. School of Computer Science, Northeast Normal University, Changchun 130024, China; 3. China Mobile Communication Group Jilin Corporation, New Service Support Center, Changchun 130061, China
  • Received: 2006-10-31 Revised: 1900-01-01 Online: 2007-05-26 Published: 2007-05-26
  • Contact: LI Yongli

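Abstract (from the Chinese): When defining whether a transaction contains a pattern, traditional pattern mining algorithms do not consider the containment relations among patterns, which degrades the clustering results. To address this problem, a new outlier detection algorithm based on pattern clustering, PCOT, is proposed. The algorithm is suited to high-dimensional data spaces. It adopts a new definition of a transaction containing a pattern, represents the patterns as a hypergraph, and clusters them with a hypergraph partitioning method. Experiments and analysis show that the algorithm effectively finds outliers in high-dimensional sparse spaces.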

Keywords (from the Chinese): data mining, outlier, clustering, hypergraph partitioning

Abstract: In defining when a transaction contains a pattern, traditional pattern mining algorithms do not consider the containment relations among patterns. Clustering based on such patterns therefore groups dissimilar transactions together, so the clustering result is not good enough. This paper presents a new algorithm called PCOT (pattern-based clustering outlier test). PCOT is suited to high-dimensional spaces and uses a new definition of a transaction containing a pattern. A novel hypergraph model represents the relations among the patterns, and a hypergraph partitioning method is used to cluster them. Experiments show that this approach effectively finds outliers in high-dimensional sparse spaces.

Key words: data mining, outlier, clustering, hypergraph partitioning

CLC number:

  • TP18
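
As a rough illustration of the pipeline the abstract outlines (mine patterns, link them in a hypergraph, cluster the patterns, then flag transactions that fit no cluster), here is a minimal Python sketch. It is not PCOT itself. The abstract gives no algorithmic details, so the brute-force miner, the plain subset test for containment, the connected-components grouping used in place of a real hypergraph partitioner, the coverage-based score, and all function names, data, and thresholds are assumptions made for illustration.

```python
from itertools import combinations


def frequent_patterns(transactions, min_support, max_len=3):
    """Brute-force mining of itemsets (size >= 2) found in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    found = []
    for size in range(2, max_len + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            if sum(pattern <= t for t in transactions) >= min_support:
                found.append(pattern)
    return found


def cluster_patterns(transactions, patterns):
    """Group patterns that are linked by hyperedges (one hyperedge per transaction).

    Assumed stand-in for hypergraph partitioning: connected components via union-find.
    """
    parent = {p: p for p in patterns}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for t in transactions:
        # Plain subset containment (assumed; not the paper's new definition).
        edge = [p for p in patterns if p <= t]
        for p in edge[1:]:
            parent[find(p)] = find(edge[0])

    clusters = {}
    for p in patterns:
        clusters.setdefault(find(p), []).append(p)
    return list(clusters.values())


def outlier_scores(transactions, clusters):
    """Score = best fraction of a transaction's items covered by a single cluster's patterns."""
    scores = []
    for t in transactions:
        best = 0.0
        for cluster in clusters:
            covered = set().union(*[p for p in cluster if p <= t])
            best = max(best, len(covered) / len(t))
        scores.append(best)
    return scores


# Hypothetical toy data: two dense groups and one transaction that fits neither.
transactions = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"},
    {"x", "y", "z"}, {"x", "y", "z"}, {"x", "y"},
    {"a", "q", "z"},
]
patterns = frequent_patterns(transactions, min_support=2)
clusters = cluster_patterns(transactions, patterns)
scores = outlier_scores(transactions, clusters)
outliers = [i for i, s in enumerate(scores) if s < 0.5]  # 0.5 is an arbitrary cutoff
print(outliers)  # prints [6]
```

Connected components merge any two pattern groups that a single transaction bridges. A real hypergraph partitioner would instead cut weakly weighted hyperedges, which matters on noisy, high-dimensional data.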