Journal of Jilin University (Information Science Edition) ›› 2025, Vol. 43 ›› Issue (5): 1144-1150.
QI Weiru, BI Peng
Abstract: Regional economic data come from diverse sources, including statistical departments, enterprise reports, and sensor data. These sources differ significantly in data format, structure, and semantics, which makes uniform processing difficult. As a result, data features are hard to extract accurately, and classification methods produce inaccurate results. To address this issue, an integrated classification method for regional economic big data based on a parallel clustering algorithm is proposed. Based on the characteristics of regional economic big data, the purity and neighborhood radius of the data are calculated, and missing values are identified, corrected, and filled in. The filled data are then randomly divided into multiple subsets using the parallel clustering algorithm, which uses multi-node parallel processing to significantly improve computational efficiency and meet the requirements of large-scale data processing. The feature quantities of each data subset are extracted, and a base classifier is designed accordingly. Taking the internal data density of each base classifier into account, the weight of each base classifier is determined, the classification results of the base classifiers are combined, and the final ensemble classification result is output. Experimental results show that the proposed method achieves a Davies-Bouldin Index (DBI) of 0.31 in practical applications, indicating that it can classify regional economic big data accurately.
Key words: parallel clustering algorithm, regional economic big data, big data integration, big data classification, base classifier, correction filling
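
For readers unfamiliar with the metric reported in the abstract, the Davies-Bouldin Index averages, over all clusters, the worst-case ratio of within-cluster scatter to between-centroid distance, so lower values indicate more compact and better-separated groups. The following is a minimal sketch of the standard definition only, not the authors' implementation; the function names and the toy two-cluster data are hypothetical and chosen purely for illustration.

```python
import math


def centroid(points):
    """Return the coordinate-wise mean of a list of points as a tuple."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def davies_bouldin_index(clusters):
    """Compute the Davies-Bouldin Index for a list of clusters.

    clusters: list of clusters, each a non-empty list of equal-length
    numeric tuples. Assumes at least two clusters with distinct centroids.
    """
    k = len(clusters)
    if k < 2:
        raise ValueError("DBI needs at least two clusters")

    centroids = [centroid(c) for c in clusters]
    # Scatter s_i: mean Euclidean distance from cluster points to the centroid.
    scatters = [
        sum(math.dist(p, mu) for p in c) / len(c)
        for c, mu in zip(clusters, centroids)
    ]

    total = 0.0
    for i in range(k):
        # Worst (largest) similarity ratio between cluster i and any other cluster.
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical toy data: two tight clusters that lie far apart.
toy_clusters = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]
dbi = davies_bouldin_index(toy_clusters)
print(dbi)
```

In this toy case each cluster has scatter 1 and the centroids are 10 apart, so the index is (1 + 1) / 10 = 0.2. Moving the clusters closer together or spreading their points out would raise the value.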
QI Weiru, BI Peng. Integrated Classification Method for Regional Economic Big Data Based on Parallel Clustering Algorithm[J]. Journal of Jilin University (Information Science Edition), 2025, 43(5): 1144-1150.
URL: http://xuebao.jlu.edu.cn/xxb/EN/
http://xuebao.jlu.edu.cn/xxb/EN/Y2025/V43/I5/1144