J4
• Computer Science •
ZHANG Chang-li, HE Feng-ling, ZUO Wan-li
Abstract: An automatic, dictionary-free Chinese word segmentation method based on a suffix array algorithm is proposed. The algorithm uses a suffix array together with a HashMap to obtain the co-occurrence patterns of Chinese characters, and then filters these patterns by confidence to select Chinese words. Experimental results show that the algorithm acquires high-frequency lexical items effectively and efficiently without the help of either a dictionary or a corpus. The method is particularly suitable for Chinese information processing applications that are sensitive to lexical frequency and are time-critical.
Key words: Chinese information processing, automatic Chinese word segmentation, suffix array, HashMap
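The pipeline summarized in the abstract (suffix array, hash-map pattern counts, confidence filter) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define confidence, so the ratio used here (a pattern's frequency divided by the larger frequency of its two shorter sub-patterns) is an assumption, as are the function names, the thresholds `min_freq` and `min_conf`, the length limit `max_len`, and the naive suffix array construction.

```python
def build_suffix_array(text):
    """Return the start indices of all suffixes of text in lexicographic order."""
    # Naive construction by sorting suffixes; adequate for a short illustration.
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_patterns(text, max_len=4):
    """Count every substring of length 1..max_len using the suffix array.

    Suffixes that share a k-character prefix are adjacent in the suffix
    array, so each group of equal prefixes yields one frequency, which is
    stored in a dict (the Python analogue of a HashMap).
    """
    suffix_array = build_suffix_array(text)
    counts = {}
    for k in range(1, max_len + 1):
        previous = None
        for start in suffix_array:
            prefix = text[start:start + k]
            if len(prefix) < k:
                continue  # suffix too short to contain a k-character pattern
            if prefix == previous:
                counts[prefix] += 1
            else:
                counts[prefix] = 1
                previous = prefix
    return counts


def extract_words(text, min_freq=2, min_conf=0.8, max_len=4):
    """Keep repeated patterns whose (assumed) confidence reaches min_conf."""
    counts = count_patterns(text, max_len)
    words = {}
    for pattern, freq in counts.items():
        if len(pattern) < 2 or freq < min_freq:
            continue
        # Assumed confidence: share of occurrences of the pattern's two
        # (len - 1)-character parts that fall inside the full pattern.
        confidence = freq / max(counts[pattern[:-1]], counts[pattern[1:]])
        if confidence >= min_conf:
            words[pattern] = freq
    return words
```

Under these assumptions, `extract_words(text)` returns a dict that maps each retained character pattern to its frequency. Sub-patterns of a retained pattern can also pass the filter, so a practical system would likely need a further step to suppress them.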
ZHANG Chang-li, HE Feng-ling, ZUO Wan-li. An automatic and dictionary-free Chinese word segmentation method based on suffix array[J]. J4, 2004, 42(04): 548-553.
URL: http://xuebao.jlu.edu.cn/lxb/EN/
http://xuebao.jlu.edu.cn/lxb/EN/Y2004/V42/I04/548