吉林大学学报(工学版) ›› 2016, Vol. 46 ›› Issue (4): 1216-1221.doi: 10.13229/j.cnki.jdxbgxb201604030

• Orginal Article • Previous Articles     Next Articles

Frequent itemsets mining algorithm based on sort tree

WANG Hong-mei1, 2, DANG Yuan-yuan1, HU Ming1, LIU Da-you2, 3   

  1. 1.School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012,China;
    2.College of Computer Science and Technology, Jilin University, Changchun 130012,China;
    3.Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education,Jilin University,Changchun 130012, China
  • Received:2015-05-05 Online:2016-07-20 Published:2016-07-20

Abstract: In this paper, the concept of sort tree is proposed, and the sort tree is used to store frequent itemsets, which proves the property of the last item pruning. Joining and pruning operations a re implemented as Apriori algorithm with O(1) time complexity. Ancestor-brother express is used to store the sort tree. If an ancestor does not exist in a transaction, all the brother nodes with the same ancestor will be skipped. So the time performance of counting support can be improved. Theoretical analysis and experimental results show that the proposed algorithm can improve the time performance greatly compared with Apriori algorithm.

Key words: artifical intelligence, frequent itemsets, last term pruning, sort tree, ancestor-brother express

CLC Number: 

  • TP311
[1] Schonberger V M, Cukier K. Big Data[M]. Eamon Dolan/Houghton Mifflin Harcourt,2013.
[2] Agrawal R, Srikant R. Fast algorithms for mining association rules[C]∥Proc of the Int'l Conf on Very Large Data Bases,Santiago, Chile, 1994:487-499.
[3] Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[C]∥Proc of the Int'l Conf on Management of Data, Dallas, TX, 2000: 1-12.
[4] Park J S, Chen M S, Yu P S. An effective hash-based algorithm for mining association rules[C]∥Proc of the Int'l Conf on Management of data,San Jose,1995:175-186.
[5] Brin S, Motwani R, Ullman J D, et al. Dynamic itemset counting and implication rules for market basket analysis[C]∥Proc of the Int'l Conf on Management of Data, Tucson, AZ, 1997:255-264.
[6] Geerts F, Goethals B, Bussche J. A tight upper bound on the number of candidate patterns[C]∥Proc of the Int'l Conf on Data mining, San Jose, CA,2001: 155-162.
[7] 李雄飞, 苑森淼, 董立岩, 等. 多段支持度数据挖掘算法研究[J]. 计算机学报,2001,24(6): 661-665.
Li Xiong-fei, Yuan Sen-miao, Dong Li-yan, et al. A data mining algorithm based on calculating multi-segment support[J]. Chinese Journal of Computers, 2001,24(6):661-665.
[8] 王红梅, 胡明. 基于散列的频繁项集分组算法[J]. 计算机应用, 2013, 33(11): 47-51.
Wang Hong-mei, Hu Ming. Frequent itemsets grouping algorithm based on hash[J]. Journal of Computer Applications, 2013, 33(11): 47-51.
[9] Pei J, Han J, Lakshmanan L V S. Mining frequent itemsets with convertible constraints[C]∥Proc of the Int'l Conf on Data Engineering, Heidelberg, Germany, 2001:324-332.
[10] Liu J,Pan Y,Wang K,et al.Mining frequent item sets by opportunistic projection[C]∥Proc of the Int'l Conf on Knowledge Discovery in Databases,Edmonton, Canada, 2002: 23-32.
[11] 颜跃进,李舟军,陈火旺. 基于FP-Tree有效挖掘最大频繁项集[J]. 软件学报, 2005, 16(2): 215-222.
Yan Yue-jin, Li Zhou-jun, Chen Huo-wang. Efficiently mining of maximal frequent item sets based on FP-tree[J]. Journal of Software, 2005, 16(2):215-222.
[12] 杨君锐,杨莉. 分布式全局最大频繁项集更新挖掘算法[J]. 华中科技大学学报:自然科学版, 2011, 39(12):85-88.
Yang Jun-rui, Yang Li. Algorithm of updating mining for distributed global maximal frequent itemsets[J]. Journal of Huazhong University of Science and Technology(Natural Science Edition), 2011, 39(12):85-88.
[13] 毛宇星,陈彤兵,施伯乐. 一种高效的多层和概化关联规则挖掘方法[J]. 软件学报, 2010, 22(12): 2965-2980.
Mao Yu-xing, Chen Tong-bing, Shi Bo-le. Efficient method for mining multiple-level and generalized association rules[J]. Journal of Software, 2010, 22(12):2965-2980.
[14] Han J, Cheng H, Xin D, et al. Frequent pattern mining: current status and future directions[J]. Data Mining and Knowledge Discovery, 2007, 15(1): 55-86.
[1] MA Jian, FAN Jian-ping, LIU Feng, LI Hong-hui. The evolution model of objective-oriented software system [J]. 吉林大学学报(工学版), 2018, 48(2): 545-550.
[2] WANG Li-yu, CHEN Lan, HAO Xiao-ran, WANG Qiang, NI Mao. Life-aware buffer management algorithm for flash-based databases [J]. 吉林大学学报(工学版), 2017, 47(2): 632-638.
[3] YING Huan, WANG Dong-hui, WU Cheng-gang, WANG Zhe, TANG Bo-wen, LI Jian-jun. Efficient deterministic replay technique on commodity system environment [J]. 吉林大学学报(工学版), 2017, 47(1): 208-217.
[4] LI Yong, HUANG Zhi-qiu, WANG Yong, FANG Bing-wu. New approach of cross-project defect prediction based on multi-source data [J]. 吉林大学学报(工学版), 2016, 46(6): 2034-2041.
[5] WANG Nian-bin, ZHU Guan-wen, ZHOU Lian-ke, WANG Hong-wei. Novel dataspace index for efficient processing of path query [J]. 吉林大学学报(工学版), 2016, 46(3): 911-916.
[6] WANG Ke-chao, WANG Tian-tian, SU Xiao-hong, MA Pei-jun. Plagiarism detection in student programs based on frequent closed sequence mining [J]. 吉林大学学报(工学版), 2015, 45(4): 1260-1265.
[7] HUANG Hong-tao,WANG Jing,YE Hai-zhi,HUANG Shao-bin. Lazy slicing based method for verifying linear temporal logic property [J]. 吉林大学学报(工学版), 2015, 45(1): 245-251.
[8] ZHANG Qing-feng,XU Jing,LI Shan-shan. Interaction-aware parallel query scheduling strategy [J]. 吉林大学学报(工学版), 2015, 45(1): 252-260.
[9] FAN Da-juan, HUANG Zhi-qiu, XIAO Fang-xiong, ZHU Yi, WANG Jin. Compatibility analysis and adaptor generation for multi-service interaction [J]. 吉林大学学报(工学版), 2014, 44(4): 1094-1103.
[10] LIU Guo-qi, LIU Hui, GAO Yu, LIU Ying, ZHU Zhi-liang. Resource dynamic pricing strategy based on utility in cloud computing [J]. 吉林大学学报(工学版), 2013, 43(06): 1631-1637.
[11] TONG Jin, WANG Ya-hui, FAN Xue-mei, ZHANG Shu-jun, CHEN Dong-hui. Monitoring system of cold chain logistics for farm fresh produce [J]. 吉林大学学报(工学版), 2013, 43(06): 1707-1711.
[12] ZHANG Li-jun, LI Zhan-huai, CHEN Qun, LOU Ying, LI Ning. Classifying XML documents based on term semantics [J]. , 2012, (06): 1510-1514.
[13] WANG Zhi-jian, HU Yu-ping, CHEN Zhang. Describing component based software system architecture using Petri net [J]. 吉林大学学报(工学版), 2012, 42(增刊1): 304-308.
[14] WU Xiao-xuan, NI Zhi-wei, NI Li-ping. Clustering ensembles algorithm based on fractal dimension [J]. 吉林大学学报(工学版), 2012, 42(增刊1): 364-367.
[15] BAI Hong-tao, OU YANG Dan-tong, LI Xi-ming, HE Li-li. Multiple ant colonies sharing common pheromone matrix based on CPU [J]. 吉林大学学报(工学版), 2011, 41(6): 1678-1683.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] LIU Song-shan, WANG Qing-nian, WANG Wei-hua, LIN Xin. Influence of inertial mass on damping and amplitude-frequency characteristic of regenerative suspension[J]. 吉林大学学报(工学版), 2013, 43(03): 557 -563 .
[2] CHU Liang, WANG Yan-bo, QI Fu-wei, ZHANG Yong-sheng. Control method of inlet valves for brake pressure fine regulation[J]. 吉林大学学报(工学版), 2013, 43(03): 564 -570 .
[3] LI Jing, WANG Zi-han, YU Chun-xian, HAN Zuo-yue, SUN Bo-hua. Design of control system to follow vehicle state with HIL test beach[J]. 吉林大学学报(工学版), 2013, 43(03): 577 -583 .
[4] HU Xing-jun, LI Teng-fei, WANG Jing-yu, YANG Bo, GUO Peng, LIAO Lei. Numerical simulation of the influence of rear-end panels on the wake flow field of a heavy-duty truck[J]. 吉林大学学报(工学版), 2013, 43(03): 595 -601 .
[5] WANG Tong-jian, CHEN Jin-shi, ZHAO Feng, ZHAO Qing-bo, LIU Xin-hui, YUAN Hua-shan. Mechanical-hydraulic co-simulation and experiment of full hydraulic steering systems[J]. 吉林大学学报(工学版), 2013, 43(03): 607 -612 .
[6] ZHANG Chun-qin, JIANG Gui-yan, WU Zheng-yan. Factors influencing motor vehicle travel departure time choice behavior[J]. 吉林大学学报(工学版), 2013, 43(03): 626 -632 .
[7] MA Wan-jing, XIE Han-zhou. Integrated control of main-signal and pre-signal on approach of intersection with double stop line[J]. 吉林大学学报(工学版), 2013, 43(03): 633 -639 .
[8] YU De-xin, TONG Qian, YANG Zhao-sheng, GAO Peng. Forecast model of emergency traffic evacuation time under major disaster[J]. 吉林大学学报(工学版), 2013, 43(03): 654 -658 .
[9] XIAO Yun, LEI Jun-qing, ZHANG Kun, LI Zhong-san. Fatigue stiffness degradation of prestressed concrete beam under multilevel amplitude cycle loading[J]. 吉林大学学报(工学版), 2013, 43(03): 665 -670 .
[10] XIAO Rui, DENG Zong-cai, LAN Ming-zhang, SHEN Chen-liang. Experiment research on proportions of reactive powder concrete without silica fume[J]. 吉林大学学报(工学版), 2013, 43(03): 671 -676 .