• Computer Science

A Mining Algorithm for Fast Maximal Sequential Patterns

CHANG Xiaoyu, WANG Zhe, XU Xiujuan, LU Chunyi, ZHOU Chunguang   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2005-08-02 Revised: 1900-01-01 Online: 2006-07-26 Published: 2006-07-26
  • Contact: ZHOU Chunguang

Abstract: This paper proposes MFSPAN (maximal frequent sequential pattern mining algorithm), a novel algorithm for mining the complete set of maximal frequent sequential patterns in sequence databases. It addresses the problem that, in sequential pattern mining, the number of frequent subsequences grows explosively as pattern length increases. MFSPAN exploits the property that different sequences may share a common prefix to reduce the number of itemset comparisons. Experiments on standard test datasets demonstrate the effectiveness of MFSPAN.
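
To make the notion of a maximal frequent sequential pattern concrete, the following is a minimal brute-force sketch. It is not MFSPAN and makes no claim about how MFSPAN works internally. As simplifying assumptions, each sequence element is a single item rather than an itemset, and the function names (`is_subsequence`, `support`, `frequent_patterns`, `maximal_patterns`) and the toy database are hypothetical. The sketch only illustrates why reporting maximal patterns yields a far smaller result than reporting every frequent subsequence.

```python
from typing import List, Set, Tuple

Sequence = Tuple[str, ...]


def is_subsequence(pattern: Sequence, sequence: Sequence) -> bool:
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern: Sequence, database: List[Sequence]) -> int:
    """Count the sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database)


def frequent_patterns(database: List[Sequence], min_support: int) -> Set[Sequence]:
    """Enumerate all frequent patterns by naive depth-first prefix growth."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    items = sorted({x for s in database for x in s})
    frequent: Set[Sequence] = set()
    stack: List[Sequence] = [()]  # start from the empty prefix
    while stack:
        prefix = stack.pop()
        for item in items:
            candidate = prefix + (item,)
            # Every prefix of a frequent pattern is itself frequent,
            # so only frequent candidates need to be extended further.
            if support(candidate, database) >= min_support:
                frequent.add(candidate)
                stack.append(candidate)
    return frequent


def maximal_patterns(frequent: Set[Sequence]) -> Set[Sequence]:
    """Keep only patterns that are not contained in another frequent pattern."""
    return {
        p for p in frequent
        if not any(p != q and is_subsequence(p, q) for q in frequent)
    }


# Toy database (hypothetical data, for illustration only).
db: List[Sequence] = [
    ("a", "b", "c"),
    ("a", "c"),
    ("a", "b", "c", "d"),
    ("b", "d"),
]
all_frequent = frequent_patterns(db, min_support=2)
print(len(all_frequent))                       # 9 frequent patterns
print(sorted(maximal_patterns(all_frequent)))  # [('a', 'b', 'c'), ('b', 'd')]
```

In this toy example, nine frequent patterns collapse to two maximal ones, and the gap widens quickly as patterns get longer, since a frequent pattern of length n has up to 2^n - 1 non-empty frequent subsequences.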

Key words: sequential pattern, maximal sequential pattern, long pattern, depth-first

CLC number:

  • TP31