Journal of Jilin University(Information Science Ed

Data Mining Parallel Algorithm Based on G4ICCS

LIU Wei(a,b), LU Lai-jun(c), WANG Hong-xiao(b), CAO Yan-bo(b)

  a. Mineral Prediction Institute, Jilin University, Changchun 130026, China; b. Center for Computer Fundamental Education, Jilin University, Changchun 130012, China; c. College of Earth Sciences, Jilin University, Changchun 130026, China
  • Received: 2013-02-28 Online: 2013-05-27 Published: 2013-06-07

Abstract:

Because the traditional decision tree algorithm SPRINT (Scalable Parallelizable Induction of Decision Trees) cannot handle the mining of massive geoscience data, this paper designs and implements PSPRINT, a parallel decision tree classification algorithm based on G4ICCS (Geology Geography Geochemistry Geophysics Information Cloud Computing System). The algorithm uses a hash table to store the data records that fall on either side of the split point of a continuous attribute. This table provides the basis for splitting nodes in parallel and makes the mining of massive geoscience data feasible. The experimental results show that the parallel decision tree algorithm can handle the classification of massive geoscience data in a simulated cloud computing environment, with better stability and processing speed.
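
The hash-table step described in the abstract can be illustrated with a minimal single-process sketch. It is not the PSPRINT implementation: it assumes the usual SPRINT layout of one attribute list per attribute, each entry holding (value, class label, record id), and a Gini-based choice of split point. The function names, the "ore"/"barren" labels, and the sample values are all hypothetical.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a class-count mapping covering `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(attr_list):
    """Scan one continuous attribute list of (value, label, record_id).

    Returns (threshold, weighted_gini) for the best test `value <= threshold`.
    """
    entries = sorted(attr_list)
    n = len(entries)
    below = Counter()
    above = Counter(label for _, label, _ in entries)
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label, _ = entries[i]
        below[label] += 1
        above[label] -= 1
        next_value = entries[i + 1][0]
        if value == next_value:
            continue  # no split point between equal values
        n_below = i + 1
        n_above = n - n_below
        score = (n_below * gini(below, n_below)
                 + n_above * gini(above, n_above)) / n
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best


def build_split_table(attr_list, threshold):
    """Hash table: record_id -> side of the split point ("left" or "right")."""
    return {rid: ("left" if value <= threshold else "right")
            for value, _, rid in attr_list}


def partition(attr_list, table):
    """Split any other attribute list by probing the hash table."""
    left = [e for e in attr_list if table[e[2]] == "left"]
    right = [e for e in attr_list if table[e[2]] == "right"]
    return left, right


# Hypothetical sample: two attribute lists over the same four records.
grade = [(0.25, "barren", 0), (0.5, "barren", 1), (1.0, "ore", 2), (1.5, "ore", 3)]
depth = [(30.0, "barren", 0), (12.0, "ore", 3), (25.0, "barren", 1), (18.0, "ore", 2)]

threshold, score = best_split(grade)
table = build_split_table(grade, threshold)
depth_left, depth_right = partition(depth, table)
```

Because the table is keyed by record id, each of the other attribute lists can be partitioned independently of the rest, which is what would allow that work to be distributed across workers in a parallel setting.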

Key words: geology geography geochemistry geophysics information cloud computing system (G4ICCS), data mining, decision tree algorithm, parallel

CLC Number: 

  • TP312