Journal of Jilin University(Information Science Ed ›› 2016, Vol. 34 ›› Issue (4): 543-549.
DONG Yaze a, LI Wanlong b, LI Hang b, ZHENG Shanhong b
Abstract: Improving the accuracy and precision of search engines is an urgent problem in the Internet era. Building on the basic suffix tree clustering algorithm, an improved search results clustering algorithm is proposed in which the vector space model is combined with suffix tree clustering to improve the merging of base clusters. In addition, the number of texts corresponding to a base cluster node, the number of words in the phrase, the phrase weight, and whether the phrase contains the query terms are combined as the selection criteria for cluster labels, which improves the rationality and readability of the labels. Finally, the method is tested on the text classification data in the Sogou corpus. The experimental results show that the method improves the accuracy of the clustering results to a certain extent.
Key words: suffix tree, text clustering, Web retrieval results, vector space model
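The two ideas in the abstract, merging base clusters with help from the vector space model and scoring candidate labels from several phrase features, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the cosine threshold, the centroid construction, the multiplicative scoring rule, and the names `merge_base_clusters` and `label_score` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def centroid(doc_ids, docs):
    """Sum the term-frequency vectors of the documents in one base cluster."""
    total = Counter()
    for d in doc_ids:
        total.update(docs[d])  # docs[d] is a list of tokens
    return total


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def merge_base_clusters(base, docs, threshold=0.5):
    """Merge base clusters whose centroids are similar (assumed criterion).

    base: phrase -> set of document ids (as produced by a suffix tree)
    docs: document id -> list of tokens
    """
    phrases = list(base)
    vecs = {p: centroid(base[p], docs) for p in phrases}
    parent = {p: p for p in phrases}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for i, a in enumerate(phrases):
        for b in phrases[i + 1:]:
            if cosine(vecs[a], vecs[b]) >= threshold:
                parent[find(a)] = find(b)

    merged = {}
    for p in phrases:
        merged.setdefault(find(p), set()).update(base[p])
    return sorted(sorted(c) for c in merged.values())


def label_score(phrase, doc_count, weight, query_terms):
    """Hypothetical label score: document count x phrase length x weight,
    boosted when the phrase contains a query term."""
    words = phrase.split()
    bonus = 1.5 if any(w in query_terms for w in words) else 1.0
    return doc_count * len(words) * weight * bonus


docs = {
    0: ["suffix", "tree", "clustering"],
    1: ["suffix", "tree", "search"],
    2: ["football", "match"],
    3: ["football", "league"],
}
base = {"suffix tree": {0, 1}, "tree": {0, 1}, "football": {2, 3}}
clusters = merge_base_clusters(base, docs)
```

In this toy input the base clusters for "suffix tree" and "tree" cover the same documents and are merged, while "football" stays separate. The classic suffix tree clustering merge rule uses document-set overlap instead of cosine similarity; how the paper combines the two is not stated in the abstract.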
DONG Yaze, LI Wanlong, LI Hang, ZHENG Shanhong. Improved Algorithm of Web Retrieve Results Clustering Based on Suffix Tree[J].Journal of Jilin University(Information Science Ed, 2016, 34(4): 543-549.
URL: http://xuebao.jlu.edu.cn/xxb/EN/
http://xuebao.jlu.edu.cn/xxb/EN/Y2016/V34/I4/543