Journal of Jilin University (Information Science Edition), 2018, Vol. 36, Issue (6): 674-680.

Research on Improved VSM-HowNet Fusion Similarity Algorithm

XIAO Shang1, FANG Zhiyi2, DONG Hongliang3, ZHAO Shuai2, WANG Hanyu4

  1. Product Innovation Center, Tianchi Media Company Limited, Beijing 100020, China; 2. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 3. Information and Data Research Lab, Banine Technologies Ltd, Changchun 130012, China; 4. College of Information Science and Technology, Northeast Normal University, Changchun 130017, China
  • Received: 2018-06-21 Online: 2018-11-23 Published: 2019-02-25

Abstract: With the development of society, more and more “Sensational Headline” news has appeared, in which the headline does not match the text and is written to attract the audience's attention. To identify “Sensational Headline” news, three text similarity calculation methods are introduced: an improved VSM combined with cosine similarity method, the HowNet method, and an improved VSM-HowNet fusion similarity algorithm. These methods achieve higher total accuracy, total recall rate, and total F1 value for “Sensational Headline” news recognition than other text similarity calculation methods. For identifying news of an unknown type, the improved VSM-HowNet fusion similarity algorithm is more advantageous than other text similarity calculation methods.

Key words: “Sensational Headline” news, improved VSM combined with cosine similarity method, HowNet method, improved VSM-HowNet fusion similarity algorithm

CLC Number: 

  • TP37