J4 ›› 2012, Vol. 30 ›› Issue (5): 544-.


Improved Feature Selection Method

GUO Xiao-dong1, JIANG Yu-ming2, FEI Fei3   

  1. Changchun Engineering Consulting Service Center, Changchun 130042, China;
  2. School of Physics and Engineering, SUN YAT-SEN University, Zhongshan 510006, China;
  3. School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China
  • Online:2012-09-28 Published:2012-11-01

Abstract:

In the traditional mutual information feature selection method, marginal probability has a disproportionate effect on the score, so rare words may be rated higher than common words and low-frequency words end up being selected. To address this shortcoming, we analyze several traditional feature extraction methods and combine the mutual information method with word frequency characteristics by introducing disparity and average frequency, which increases the accuracy of mutual-information-based classification. Experiments show that the method yields better classification results.
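
The rare-word bias described above can be illustrated with a minimal sketch. It assumes the commonly used pointwise form of the score, MI(t, c) = log( P(t, c) / (P(t) P(c)) ), estimated from document counts. The function name `pointwise_mi` and all counts are hypothetical and chosen only for illustration. The sketch shows the traditional score only; it does not implement the disparity and average frequency terms, which the abstract does not define.

```python
import math


def pointwise_mi(n_tc: int, n_t: int, n_c: int, n: int) -> float:
    """Estimate log( P(t, c) / (P(t) * P(c)) ) from document counts.

    n_tc: documents in class c that contain term t
    n_t:  documents that contain term t
    n_c:  documents in class c
    n:    total number of documents
    """
    p_tc = n_tc / n
    p_t = n_t / n
    p_c = n_c / n
    return math.log(p_tc / (p_t * p_c))


# Hypothetical corpus: 1000 documents, 100 of them in class c.
N, N_C = 1000, 100

# A rare term that appears in only 2 documents, both in class c.
rare_score = pointwise_mi(n_tc=2, n_t=2, n_c=N_C, n=N)

# A common term that appears in 100 documents, 80 of them in class c.
common_score = pointwise_mi(n_tc=80, n_t=100, n_c=N_C, n=N)

print(f"rare term:   {rare_score:.3f}")
print(f"common term: {common_score:.3f}")
```

In this toy setting the rare term scores higher than the common term even though the common term covers far more of the class, because the small marginal probability P(t) in the denominator inflates the ratio.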

Key words: text classification, feature selection, mutual information

CLC Number: 

  • TP37