J4

• Computer Science

Compute All Maximal (NE/SNE) Repeats in a String with the Horizontal-Division Method

YUAN Zhe1, ZHAO Yongzhe1, ZHANG Wenrui2, ZHU Xiangbin1, ZHAO Dongwei1

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. Jilin City Branch, Jilin Mobile Communication Co., Ltd., Jilin 132001, Jilin Province, China
  • Received: 2008-02-25 Revised: 1900-01-01 Online: 2008-09-26 Published: 2008-09-26
  • Contact: ZHAO Yongzhe

Abstract: We propose a new method, the horizontal-division method, for computing all maximal repeats of a given string x[1…n] using only its suffix array SAx and its longest-common-prefix array LCPx. By analyzing the positions at which the maximal non-extendible (NE) repeats and maximal super-non-extendible (SNE) repeats of x can occur, together with the conditions that identify them, we give three algorithms that compute all maximal repeats, maximal NE repeats, and maximal SNE repeats of x from SAx and LCPx alone. These algorithms overcome a drawback of the corresponding earlier algorithms, which require auxiliary arrays in addition to the suffix array and the LCP array, and so reduce the space requirement without increasing the time complexity. Moreover, all maximal repeats, maximal NE repeats, and maximal SNE repeats of a string can be obtained with a single scan of the LCP array.

Key words: repeats (substrings), suffix array, horizontal-division method

CLC number:

  • TP301
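The abstract describes finding repeats from the suffix array and LCP array in one scan of the LCP array. The sketch below illustrates that general idea only; it is not the authors' horizontal-division method, whose details the abstract does not give. It assumes a simplified definition: a repeated substring is reported when its occurrences cannot all be extended by the same character on the left or on the right. The paper's exact definitions of maximal, NE, and SNE repeats may differ. The function names are hypothetical, indices are 0-based rather than the paper's x[1…n], and the suffix array is built naively for brevity.

```python
def suffix_array(x):
    # Naive construction for illustration: sort suffix start positions lexicographically.
    return sorted(range(len(x)), key=lambda i: x[i:])


def lcp_array(x, sa):
    # Kasai-style construction: lcp[r] is the length of the longest common
    # prefix of the suffixes at ranks r-1 and r; lcp[0] is 0.
    n = len(x)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and x[i + h] == x[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def non_extendible_repeats(x):
    # Assumed definition (not taken from the paper): report (substring, positions)
    # for every repeated substring whose occurrences are neither all preceded by
    # the same character nor all followed by the same character.
    n = len(x)
    sa = suffix_array(x)
    lcp = lcp_array(x, sa)
    results = []
    stack = []  # open intervals as (prefix length, leftmost rank)
    for r in range(1, n + 1):
        cur = lcp[r] if r < n else 0  # sentinel 0 closes every open interval
        left = r - 1
        while stack and stack[-1][0] > cur:
            length, left = stack.pop()
            # Ranks left..r-1 hold all occurrences of one right-non-extendible substring.
            positions = sorted(sa[left:r])
            left_extendible = (
                positions[0] > 0
                and len({x[p - 1] for p in positions}) == 1
            )
            if not left_extendible:
                start = positions[0]
                results.append((x[start:start + length], positions))
        if cur > 0 and (not stack or stack[-1][0] < cur):
            stack.append((cur, left))
    return results
```

Only the string, the suffix array, and the LCP array are read, and the LCP array is visited once from left to right; the stack holds the intervals that are still open. For example, `non_extendible_repeats("abcabc")` reports only `"abc"` at positions 0 and 3, because `"bc"` and `"c"` are always preceded by the same character.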