Journal of Jilin University (Information Science Edition) ›› 2024, Vol. 42 ›› Issue (5): 959-965.

 Improved Decision Tree Algorithm for Big Data Classification Optimization 

 TANG Lingyi1, TANG Yiwen2, LI Beibei3   

  1. Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai 200127, China; 2. General Office (Information Management Division), Shanghai Municipal Health Commission, Shanghai 200125, China; 3. Community Health Service Center of Linfen Road, Jing’an District, Shanghai 200435, China
  • Received: 2023-05-18 Online: 2024-10-21 Published: 2024-10-23

Abstract: Massive datasets have complex structures and features, and big data is often unstructured with small-sample characteristics, which makes it difficult to classify with both high accuracy and high efficiency. To address this, a big data classification optimization method based on an improved decision tree algorithm is proposed. A fuzzy decision function is constructed to detect the sequence features of the big data, and these features are input into a decision tree model for rule mining and training. The decision tree model is then improved using the grey wolf optimization algorithm. The big data is classified with the improved model, and a classifier accuracy objective function is established to achieve accurate classification. Experimental results show that the proposed method achieves the highest classification accuracy and the lowest false positive rate, while maintaining high overall throughput and improving classification efficiency.
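
The abstract does not say how the grey wolf optimizer is coupled to the decision tree, so the following is only a minimal sketch under stated assumptions. It assumes the optimizer searches over tree parameters and that the objective is the classification error (one minus accuracy). The tree is reduced to a single-split stump, and the toy data, the function names `gwo_minimize` and `stump_error`, and all parameter values are hypothetical, not taken from the paper. The fuzzy decision function is not reproduced here.

```python
import random

def gwo_minimize(objective, bounds, n_wolves=12, n_iters=60, seed=0):
    """Textbook grey wolf optimizer for box-bounded minimization.

    Illustrative only: the paper's actual variant is not described in the abstract.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial pack inside the search box.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_fit = None, float("inf")
    for t in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        ranked = sorted(wolves, key=objective)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        fit = objective(alpha)
        if fit < best_fit:
            best_pos, best_fit = alpha, fit
        a = 2.0 * (1 - t / n_iters)  # shrinks linearly from 2 towards 0
        for w in wolves:
            for d in range(dim):
                pulled = 0.0
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - w[d])
                    pulled += leader[d] - A * D
                lo, hi = bounds[d]
                # New coordinate is the mean of the three leader-guided moves, clipped to bounds.
                w[d] = min(max(pulled / 3.0, lo), hi)
    # Account for the pack produced by the final update.
    final = min(wolves, key=objective)
    if objective(final) < best_fit:
        best_pos, best_fit = list(final), objective(final)
    return best_pos, best_fit

# Hypothetical toy data: one feature, two well-separated classes.
X = [1.0, 1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5, 9.0]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

def stump_error(position):
    """Classifier objective: error rate of a one-split tree 'predict 1 if x > threshold'."""
    threshold = position[0]
    wrong = sum((x > threshold) != bool(label) for x, label in zip(X, y))
    return wrong / len(X)

best_pos, best_fit = gwo_minimize(stump_error, bounds=[(0.0, 10.0)])
print("threshold:", best_pos[0], "error rate:", best_fit)
```

In a fuller setting the position vector would encode whatever tree parameters are being tuned (for example depth or split thresholds), and the objective would be evaluated on held-out data rather than on the training points.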

Key words: decision tree model, grey wolf optimization algorithm, objective function, big data classification, fuzzy decision function

CLC Number: TP394