Journal of Jilin University Science Edition

A New Spam Feature Selection Algorithm Based on Information Gain

LI Meng, LIU Yuanning   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2016-06-24 Online: 2017-03-26 Published: 2017-03-24
  • Contact: LIU Yuanning E-mail: lyn@jlu.edu.cn

Abstract: The concepts of intraclass dispersity and interclass concentration were proposed on the basis of the traditional information gain feature selection algorithm. Combined with the traditional information gain algorithm, they solved the problem of performance degradation caused by ignoring the distribution of the characteristic items and improved the efficiency of the information gain algorithm. The improved feature selection algorithm was applied to a spam filtering experiment. Experimental comparisons with traditional feature selection algorithms under different classifiers show that the improved feature selection algorithm has better performance.
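For orientation, the traditional information gain baseline that the abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the standard textbook definition IG(t) = H(C) - P(t)H(C|t) - P(¬t)H(C|¬t) with binary term presence, and the function names and the toy corpus are hypothetical. The abstract does not define intraclass dispersity or interclass concentration, so the improved weighting is deliberately left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Standard information gain of a term, using presence/absence only.

    docs   : list of sets of terms, one set per message
    labels : list of class labels aligned with docs
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy corpus: two spam and two legitimate ("ham") messages.
docs = [
    {"free", "money"},
    {"free", "offer"},
    {"meeting", "report"},
    {"report", "money"},
]
labels = ["spam", "spam", "ham", "ham"]

top_terms = select_features(docs, labels, 2)
```

In this toy corpus, "free" and "report" each separate the two classes perfectly, while "money" appears once in each class and therefore carries no information. Because the score depends only on whether a term occurs in a message, it says nothing about how the term is spread within or across classes, which is the distribution information the abstract says the traditional algorithm ignores.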

Key words: information gain, spam, intraclass dispersity, feature selection, interclass concentration

CLC Number: 

  • TP181