J4 ›› 2010, Vol. 48 ›› Issue (03): 427-432.

• Computer Science •

Disambiguation Method in Chinese Word Segmentation Based on Phrase Match

YAO Jiwei, ZHAO Dongfan

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2009-06-25 Online: 2010-05-26 Published: 2010-05-19
  • Contact: ZHAO Dongfan E-mail: zdf@jlu.edu.cn

Abstract:

Building on phrase structure grammar, the authors propose a disambiguation method that combines local single-phrase matching with semantic rules. By adding right-nested rules between phrases and implementing the matcher as a finite automaton, the method removes redundant items from the phrase rules, improves phrase-matching efficiency, and targets specific types of ambiguity more precisely. Experimental results show an average disambiguation rate of about 98%, which is better than that of typical disambiguation models that do not consider the syntax and semantics between words.

Key words: Chinese word segmentation; phrase match; ambiguous word; disambiguation method

CLC Number:

  • TP391
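
The abstract describes choosing among competing segmentations of an ambiguous field by matching them against phrase rules implemented as a finite automaton. The following is a minimal sketch of that general idea only, not the authors' implementation: the lexicon, the part-of-speech tags, the single "verb followed by one or more nouns" rule, and the names `LEXICON`, `TRANSITIONS`, `accepts`, and `disambiguate` are all hypothetical, and the paper's right-nested rules and semantic rules are not modeled.

```python
# Hypothetical toy lexicon mapping words to part-of-speech tags
# ("v" = verb, "n" = noun). These entries are illustrative assumptions.
LEXICON = {
    "研究": "v",
    "研究生": "n",
    "生命": "n",
    "命": "n",
    "起源": "n",
}

# A single hypothetical phrase rule, "verb followed by one or more nouns",
# encoded as a finite automaton: (state, tag) -> next state.
TRANSITIONS = {
    ("start", "v"): "verb",
    ("verb", "n"): "object",
    ("object", "n"): "object",
}
ACCEPTING = {"object"}


def accepts(tags):
    """Return True if the automaton accepts the given tag sequence."""
    state = "start"
    for tag in tags:
        state = TRANSITIONS.get((state, tag))
        if state is None:  # no transition defined: the phrase rule fails
            return False
    return state in ACCEPTING


def disambiguate(candidates):
    """Keep the candidate segmentations whose tag sequence matches the rule."""
    kept = []
    for words in candidates:
        tags = [LEXICON.get(word) for word in words]
        # Discard candidates containing words missing from the toy lexicon.
        if None not in tags and accepts(tags):
            kept.append(words)
    return kept


# Two competing segmentations of the ambiguous string "研究生命起源".
candidates = [
    ["研究", "生命", "起源"],
    ["研究生", "命", "起源"],
]
print(disambiguate(candidates))
```

In this toy setting only the first segmentation survives, because its tag sequence (verb, noun, noun) is accepted while the second (noun, noun, noun) has no transition from the start state. A table-driven automaton keeps each match linear in the number of words, which is one plausible reason to prefer it over trying phrase rules one by one.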