A Semantic Similarity Computing Approach Based on WordNet and Corpus Statistics

J4 ›› 2010, Vol. 48 ›› Issue (05): 811-816.

ZHANG Dongna, ZHOU Chunguang, LIU Yanbin, GUO Dongwei

College of Computer Science and Technology, Jilin University, Changchun 130012, China

Received: 2009-12-15 Online: 2010-09-26 Published: 2010-09-21
Contact: GUO Dongwei E-mail: guodw@jlu.edu.cn

Abstract

We first propose a new method for calculating information content, a parameter used in semantic similarity measures. The new algorithm combines the semantic information of concepts in the WordNet knowledge base with their probability in a corpus, expressed as self-information. Then, because existing algorithms are all domain-related and their calculation processes are complicated, we propose a universal method for calculating semantic similarity based on corpus statistics and WordNet, which can be used in information extraction, information retrieval, document clustering and ontology learning. In experiments on the benchmark data set of R&B concept pairs, the proposed method yields a substantial improvement.

Key words: semantic similarity of concepts, Brown corpus, information content method

CLC Number: TP391.1

ZHANG Dong-Na, ZHOU Chun-Guang, LIU Yan-Bin, GUO Dong-Wei. A Semantic Similarity Computing Approach Based on WordNet and Corpus Statistics [J]. J4, 2010, 48(05): 811-816.

