Journal of Jilin University (Information Science Edition) ›› 2026, Vol. 44 ›› Issue (1): 178-184.
FAN Zhou
Abstract: To exploit parallel computing resources and keep the algorithm scalable given the characteristics and complexity of multidimensional data streams, a parallel mining algorithm for frequent patterns in multidimensional data streams in a Hadoop environment is proposed. A Hadoop data stream processing platform is designed based on HDFS (Hadoop Distributed File System) and MapReduce. An HpFitStream clustering algorithm based on feature projection and fitting is then proposed: polynomial fitting is used to handle abnormal data streams, and feature projection reduces the dimensionality of the processed data streams to lower computational cost. Finally, the PFPonCanTree algorithm is used to mine frequent patterns of the multidimensional data streams in parallel in the Hadoop environment. Experimental results show that the proposed method effectively reduces computational complexity while improving the scalability and load balancing of the algorithm.
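As a rough illustration of the map/reduce idea behind parallel frequent pattern mining, the following minimal sketch counts candidate itemsets per data partition (map) and merges the partial counts (reduce). It is a generic, single-process illustration, not the HpFitStream or PFPonCanTree algorithm from the paper; the function names, the `max_len` limit, and the toy transactions are all assumptions made for this example.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_partition(transactions, max_len=2):
    """Map step (hypothetical helper): count itemsets within one partition."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, max_len + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def reduce_counts(partial_counts):
    """Reduce step (hypothetical helper): merge per-partition counts."""
    return reduce(lambda a, b: a + b, partial_counts, Counter())


def frequent_patterns(partitions, min_support, max_len=2):
    """Keep itemsets whose global count reaches min_support."""
    global_counts = reduce_counts(map_partition(p, max_len) for p in partitions)
    return {k: v for k, v in global_counts.items() if v >= min_support}


# Toy data: two partitions standing in for two blocks of a stream.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
result = frequent_patterns(partitions, min_support=3)
```

In an actual Hadoop deployment the map and reduce steps would run as distributed tasks over HDFS blocks; here they are ordinary function calls so that the data flow is easy to follow.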
Key words: Hadoop, MapReduce model, feature projection, polynomial fitting, frequent pattern, parallel mining
FAN Zhou. Parallel Mining Algorithm for Frequent Patterns in Multidimensional Data Streams in Hadoop Environment[J]. Journal of Jilin University (Information Science Edition), 2026, 44(1): 178-184.
URL: http://xuebao.jlu.edu.cn/xxb/EN/
http://xuebao.jlu.edu.cn/xxb/EN/Y2026/V44/I1/178