Journal of Jilin University Science Edition

Hybrid Variable Selection Algorithm Based on Mutual Information and Random Forest

ZHAO Weiwei1, LI Yanying2, ZHAO Fengqin1, WEI Sasa1   

  1. School of Mathematics and Statistics, Xidian University, Xi’an 710126, China; 2. School of Mathematics and Information Science, Baoji University of Arts and Science, Baoji 721013, Shaanxi Province, China
  • Received: 2016-07-18 Online: 2017-07-26 Published: 2017-07-13
  • Contact: ZHAO Weiwei E-mail:zhaoweiweitg@163.com

Abstract: Single variable selection algorithms often yield models with low classification accuracy and poor generalization ability. To address this problem, we propose a hybrid variable selection algorithm that works in two stages. In the filter stage, mutual information is used to quickly exclude some of the irrelevant variables, which reduces the dimension of the sample space. In the wrapper stage, a random forest is used to refine the remaining variables within the framework of permutation theory. Experimental results show that the proposed algorithm achieves higher classification accuracy and better generalization ability than the comparison algorithms.
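The two-stage idea in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: variables are treated as discrete, the filter keeps a fixed number of top-ranked variables (`keep`), the wrapper keeps variables whose permutation importance is positive, and a hypothetical stand-in predictor (`stand_in_predict`) takes the place of a trained random forest. All function names are invented for this example.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def filter_stage(X, y, keep):
    """Rank columns by mutual information with y; return the top `keep` indices."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def accuracy(predict, X, y):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, column, repeats=20, seed=0):
    """Mean drop in accuracy when the values of one column are shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for _ in range(repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
        drops.append(baseline - accuracy(predict, X_perm, y))
    return sum(drops) / repeats


# Toy data (assumed): column 0 equals the label, column 1 is a noisy copy,
# column 2 is independent of the label, column 3 is constant.
y = [0, 1] * 8
noisy = [1 - v if i < 4 else v for i, v in enumerate(y)]
independent = [0, 0, 1, 1] * 4
X = [[y[i], noisy[i], independent[i], 0] for i in range(16)]


def stand_in_predict(row):
    """Hypothetical stand-in for a fitted random forest; it only reads column 0."""
    return row[0]


# Stage 1 (filter): drop variables with low mutual information.
selected = filter_stage(X, y, keep=2)

# Stage 2 (wrapper): keep variables whose permutation hurts accuracy.
refined = [j for j in selected if permutation_importance(stand_in_predict, X, y, j) > 0]
print(selected, refined)
```

In practice the stand-in predictor would be replaced by a random forest trained on the filtered variables, and the positive-importance cutoff is only one possible refinement rule; the abstract does not specify either detail.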

Key words: hybrid algorithm, random forest, variable selection, mutual information

CLC Number: 

  • TP391