
• Computer Science •

A Heuristic Information Extraction Algorithm

WU Fenfen12, LIU Lei1, XIAO Xian1   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. Henan Communications Polytechnic, Zhengzhou 450005, China
  • Received:2006-02-15 Revised:1900-01-01 Online:2007-01-26 Published:2007-01-26
  • Contact: LIU Lei

Abstract: A heuristic information extraction algorithm is presented, and an information extraction system is built on it. The system first uses the semantic and structural characteristics of the text to determine the states that have distinctive characteristics. Given this partial result, the remaining states, which have no such characteristics, are extracted by an algorithm that combines backward dynamic programming with a forward A* search. We tested the system on 100 headers of computer science papers provided by the search engine research group at Carnegie Mellon University (CMU), USA. The results show that both recall and precision improve substantially over existing methods that are based on words and the traditional Viterbi algorithm. In conclusion, the heuristic algorithm outperforms the Viterbi algorithm.
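
The abstract does not give the algorithm in detail, so the following is only a minimal sketch of one common way to combine backward dynamic programming with a forward A* search on a hidden Markov model. It is not the authors' implementation. The function name `decode`, the `fixed` argument (positions whose state is already known from text characteristics), and the toy probabilities are all assumptions made for illustration. The backward pass computes, for each position and state, the best achievable log-score of the remaining positions; the forward search uses that value as its heuristic.

```python
import heapq
import math


def decode(obs, states, start, trans, emit, fixed=None):
    """Return (best state sequence, its log-score) for the observations.

    fixed maps a position to a state that is already known and must be kept.
    """
    fixed = fixed or {}
    n = len(obs)
    if n == 0:
        return [], 0.0

    def allowed(t):
        # A position with a known state admits only that state.
        return [fixed[t]] if t in fixed else states

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Backward dynamic programming: h[t][s] is the best log-score of
    # positions t+1..n-1, given that position t is in state s.
    h = [{s: 0.0 for s in states} for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for s in states:
            h[t][s] = max(
                log(trans[s][r]) + log(emit[r][obs[t + 1]]) + h[t + 1][r]
                for r in allowed(t + 1)
            )

    # Forward A*: expand the partial path with the largest g + h first.
    # heapq is a min-heap, so the priority is stored negated.
    frontier = []
    for s in allowed(0):
        g = log(start[s]) + log(emit[s][obs[0]])
        heapq.heappush(frontier, (-(g + h[0][s]), 0, (s,), g))
    while frontier:
        _, t, path, g = heapq.heappop(frontier)
        if t == n - 1:
            return list(path), g  # first complete path popped is optimal
        s = path[-1]
        for r in allowed(t + 1):
            g2 = g + log(trans[s][r]) + log(emit[r][obs[t + 1]])
            heapq.heappush(frontier, (-(g2 + h[t + 1][r]), t + 1, path + (r,), g2))
    return [], float("-inf")


# Toy model (invented numbers): two header fields and two token types.
states = ["title", "author"]
start = {"title": 0.9, "author": 0.1}
trans = {"title": {"title": 0.6, "author": 0.4},
         "author": {"title": 0.1, "author": 0.9}}
emit = {"title": {"word": 0.8, "name": 0.2},
        "author": {"word": 0.3, "name": 0.7}}
obs = ["word", "word", "name", "name"]

free_path, free_score = decode(obs, states, start, trans, emit)
pinned_path, pinned_score = decode(obs, states, start, trans, emit, fixed={1: "author"})
```

Because the backward values in this sketch are exact rather than estimated, the heuristic never underestimates the remaining score, and the first complete path taken from the queue is the best one. Pinning a position through `fixed` shrinks both the backward table and the forward search, which is the role the pre-determined states play in this illustration.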

Key words: heuristic algorithm, text block, A* algorithm

CLC Number: TP391