Journal of Jilin University Science Edition


Extraction of Multiword Expressions in Questions of Question Answering Communities

WU Ruihong1, LV Xueqiang1, LI Zhuo1, SHU Yan2   

  1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China; 2. Beijing TRS Information Technology Co. Ltd., Beijing 100101, China
  • Received: 2013-09-09 Online: 2014-11-26 Published: 2014-12-11
  • Contact: WU Ruihong E-mail: ruihong0417@163.com

Abstract:

Multiword expressions (MWEs) in the questions of question answering communities bear directly on question interpretation. We propose, for the first time, extracting MWEs from such questions, and present an extraction method based on the characteristics of the MWEs that occur in them. The method first obtains candidate MWEs using mutual information and stop-word filtering. It then classifies the candidates into four types: right strings, incomplete strings, redundant strings and error strings. Finally, drawing on search-engine query optimization and on the Internet retrieval results for the candidates, a revision step produces the final MWEs. Experiments on questions from the Sina iask question library show that precision, recall and F-measure reach 84%, 52% and 0.64, respectively, which demonstrates the effectiveness of the proposed method.
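
The candidate-generation step (mutual information plus stop-word filtering) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the questions are already segmented into word lists, scores only adjacent word pairs with pointwise mutual information, and uses a hypothetical stop-word list and threshold (`STOP_WORDS`, `threshold`), none of which are specified in the abstract.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the list used in the paper is not given.
STOP_WORDS = {"the", "a", "of", "to", "is", "how", "what", "in"}


def pmi_candidates(questions, threshold=1.0):
    """Return adjacent word pairs whose pointwise mutual information
    exceeds `threshold`, skipping pairs that contain a stop word.

    `questions` is a list of token lists (segmentation is assumed done).
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in questions:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    candidates = {}
    for (w1, w2), count in bigrams.items():
        # Stop-word filtering: drop pairs that include a stop word.
        if w1 in STOP_WORDS or w2 in STOP_WORDS:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        score = math.log2(p_xy / (p_x * p_y))
        if score > threshold:
            candidates[(w1, w2)] = score
    return candidates


if __name__ == "__main__":
    sample = [
        ["how", "to", "install", "hard", "disk"],
        ["hard", "disk", "is", "broken"],
        ["what", "is", "a", "hard", "disk"],
    ]
    for pair, score in pmi_candidates(sample).items():
        print(pair, round(score, 2))
```

On a toy corpus like the one above, a pair such as `("install", "hard")` can pass the threshold alongside `("hard", "disk")`. Spurious candidates of this kind are what the later classification and search-engine-based revision steps are meant to handle.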

Key words:  multiword expressions, question interpretation, mutual information, search engine

CLC Number: 

  • TP391.1