Journal of Jilin University (Information Science Edition) ›› 2026, Vol. 44 ›› Issue (3): 625-631.
JIANG Mingze, LI Wei, DONG Dan
Abstract: In multi-source data integration, data may be distributed across different subspaces and may be highly imbalanced. To improve the efficiency of data analysis, a multi-source data integration processing method based on a robust subspace clustering algorithm is proposed. First, the data balancing algorithm is improved: the maximum and the average number of samples per class are calculated, and the synthetic minority oversampling technique (SMOTE) is used to obtain a relatively balanced subset, which addresses the imbalanced data distribution. Second, the Dice coefficient similarity measure is used to calculate the cosine similarity of the multi-source data; evaluating the similarity between data from different sources addresses data heterogeneity and redundancy. Finally, after a self-representation model and an affinity graph are established to reveal the inherent correlations in the data, the robust subspace clustering algorithm is used to identify the feature subspaces of the different data. A robustness mechanism that resists the influence of noise and redundant features is introduced, the membership degree of each data item is calculated, and data integration processing is performed based on the membership degree. The experimental results show that the method achieves integrated processing of multi-source data, improves data analysis efficiency, and ensures data consistency and reliability.
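The abstract names two similarity measures, the Dice coefficient and cosine similarity, without stating formulas. The sketch below is only an illustration of the common real-valued vector forms of both measures, not the authors' implementation; the function names, the zero-vector convention, and the sample vectors are assumptions made for this example.

```python
import math
from typing import Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b: a.b / (|a| * |b|)."""
    norm_a = math.sqrt(_dot(a, a))
    norm_b = math.sqrt(_dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for all-zero vectors
    return _dot(a, b) / (norm_a * norm_b)


def dice_coefficient(a: Sequence[float], b: Sequence[float]) -> float:
    """Vector form of the Dice coefficient: 2 a.b / (|a|^2 + |b|^2)."""
    denom = _dot(a, a) + _dot(b, b)
    if denom == 0.0:
        return 0.0  # assumed convention for all-zero vectors
    return 2.0 * _dot(a, b) / denom


if __name__ == "__main__":
    # Hypothetical feature vectors for the same record seen in two sources.
    source_a = [1.0, 2.0, 3.0]
    source_b = [2.0, 4.0, 6.0]
    print(cosine_similarity(source_a, source_b))
    print(dice_coefficient(source_a, source_b))
```

In this form, cosine similarity ignores vector magnitude, whereas the Dice coefficient decreases when two vectors point the same way but differ in scale. That difference matters when sources report the same features in different units or ranges.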
Key words: robust subspace clustering algorithm, multi-source data, cosine similarity, data integration processing, high-dimensional feature space
JIANG Mingze, LI Wei, DONG Dan. Processing Method of Multisource Data Integration Based on Robust Subspace Clustering Algorithm[J]. Journal of Jilin University (Information Science Edition), 2026, 44(3): 625-631.
URL: http://xuebao.jlu.edu.cn/xxb/EN/
http://xuebao.jlu.edu.cn/xxb/EN/Y2026/V44/I3/625