J4 ›› 2013, Vol. 31 ›› Issue (2): 183-186.


Improved Suffix Array-Based Full-Text Indexing Structures

LIU Chang (刘畅)1, ZHANG Meng (张猛)2

  1. Department of Information Engineering, Jilin Business and Technology College, Changchun 130062, China; 2. Network Center, Jilin University, Changchun 130012, China
  • Received: 2011-11-26 Online: 2013-03-23 Published: 2013-06-05
  • About the author: LIU Chang (born 1978), female, from Changchun, is a lecturer at Jilin Business and Technology College whose main research area is computer networks. Tel: 86-13844176483; E-mail: liuchang8023@sina.com.cn.
  • Funding:

    Science and Technology Development Planning Fund Project of the Jilin Provincial Department of Education (2012373)

Abstract:

Finding the data one needs in vast amounts of network data is a central problem for Web search, which makes effective information retrieval algorithms and data structures an important research topic. This article studies improvements to suffix array-based full-text index structures, and designs and implements the Weighted Directed Word Graph (WDWG), a full-text index structure that reduces space usage while effectively improving indexing speed. Experiments show that, for the same problem size, the WDWG requires less storage space without degrading retrieval efficiency, making it a more efficient full-text index structure.

Key words: suffix automaton, full-text index structure

CLC number (Chinese Library Classification):

  • TP311|G354
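
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of a plain suffix array used as a full-text index. It is not the article's WDWG, whose construction is not described on this page. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive construction is chosen for readability rather than speed.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction that sorts the suffixes directly; fine for small
    illustrative inputs, too slow for large texts.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return the sorted start positions of `pattern` in `text`.

    Because the suffixes are sorted, all suffixes that begin with `pattern`
    form one contiguous block of `sa`, located with two binary searches.
    """
    if not pattern:
        return []
    m = len(pattern)

    # First index whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First index whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

In this sketch the index stores one integer per character of the text, and each query costs a logarithmic number of string comparisons. Space usage and lookup speed of this kind are what the abstract says the WDWG is designed to improve.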