J4 ›› 2009, Vol. 47 ›› Issue (4): 790-794.
XU Pei-Juan, LI Xiong-Fei, HUI Yue, ZHANG Gui-Lin
Abstract:
Based on an analysis of how ambiguity is handled in Chinese word segmentation, this paper presents a context-based bidirectional scan word segmentation algorithm. To improve the word segmentation dictionary, we add fixed phrases to it, and we discuss feature selection and the design of the weighting schema in detail. To address the shortcomings of the general TFIDF weighting schema, we take statistical information into account and propose an item-scoring method that improves the effectiveness of feature items in text categorization. Finally, experiments demonstrate the advantage of the improved weighting schema.
Key words: text categorization; context bidirectional scan; vector space model; weighting schema; feature selection
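For orientation, the general TFIDF weighting that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook formula (term frequency times the log of inverse document frequency), not the improved item-scoring schema proposed in the paper, whose details the abstract does not give. The function name `tfidf_weights` and the sample documents are hypothetical, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Compute textbook TF-IDF weights (hypothetical helper, not the paper's schema).

    docs: list of documents, each a list of already-segmented tokens.
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        # Relative term frequency times log inverse document frequency.
        w = {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        weights.append(w)
    return weights


# Made-up sample corpus of three tokenized documents.
docs = [
    ["text", "categorization", "algorithm"],
    ["segmentation", "algorithm"],
    ["text", "segmentation", "dictionary"],
]
weights = tfidf_weights(docs)
```

A term that occurs in every document receives weight zero under this formula, and the formula ignores how a term is distributed across categories, which is one commonly cited limitation of plain TFIDF for categorization.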
XU Pei-Juan, LI Xiong-Fei, HUI Yue, ZHANG Gui-Lin. Research and Implementation of Related Algorithm of Chinese Text Categorization[J]. J4, 2009, 47(4): 790-794.
URL: http://xuebao.jlu.edu.cn/lxb/EN/
http://xuebao.jlu.edu.cn/lxb/EN/Y2009/V47/I4/790