Journal of Jilin University (Information Science Edition) ›› 2014, Vol. 32 ›› Issue (1): 95-100.

• Articles

Program Similarity Detection Based on Syntax Tree

SHI Ye, HUANG Long-he, CHE Tian-yang, GAO Si, WANG Jian

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received:2012-11-09 Online:2014-01-24 Published:2014-04-03
  • About the author: SHI Ye (born 1989), male, from Changchun, master's student at Jilin University; research interests include program analysis techniques and cryptography. (Tel) 86-13104454724 (E-mail) shiye12@mails.jlu.edu.cn
  • Supported by:

    National-level project of the Jilin University "Undergraduate Innovative Experiment Program" (2010A53083)

Abstract:

To address code plagiarism and software piracy, we analyze three traditional similarity detection methods based on program structure and propose a program similarity detection method based on syntax trees. The method first parses the source program to obtain its syntax tree, then analyzes the syntactic structure of the program on that tree and computes the similarity. Working at the level of syntactic structure removes the interference introduced by advanced plagiarism techniques. Using a set of plagiarized scripts as test programs, the experiments show that the syntax-tree-based method effectively detects 10 kinds of plagiarism techniques, such as adding redundant statements and replacing control structures with equivalent ones.

Key words: code clone, program structure, similarity detection

CLC Number:

  • TP301
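
The abstract describes a two-step pipeline: parse each source program into a syntax tree, then compare the trees' structure to obtain a similarity score. The abstract does not give the paper's actual similarity formula, so the following is only a minimal sketch of the general idea under stated assumptions: Python's standard `ast` module stands in for the parser, and the helper names `structure_profile` and `similarity`, as well as the multiset Jaccard measure over parent/child node-type pairs, are illustrative choices rather than the authors' method.

```python
import ast
from collections import Counter


def structure_profile(source: str) -> Counter:
    """Count (parent node type, child node type) edges in the syntax tree.

    Hypothetical helper: identifiers and literal values are not part of the
    profile, so renaming variables or changing constants does not affect it.
    """
    tree = ast.parse(source)
    profile = Counter()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            # Skip Load/Store/Del context markers: they carry no structure.
            if isinstance(child, ast.expr_context):
                continue
            profile[(type(parent).__name__, type(child).__name__)] += 1
    return profile


def similarity(source_a: str, source_b: str) -> float:
    """Multiset Jaccard similarity of two structure profiles, in [0, 1].

    Assumed measure for illustration only, not the formula from the paper.
    """
    a, b = structure_profile(source_a), structure_profile(source_b)
    union = sum((a | b).values())       # per-edge maximum counts
    if union == 0:
        return 1.0                      # two empty programs are identical
    return sum((a & b).values()) / union  # per-edge minimum counts


original = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""

# Same structure with renamed identifiers: a simple disguise.
renamed = """
def acc(items):
    s = 0
    for x in items:
        s += x
    return s
"""

# A structurally different program.
unrelated = """
def greet(name):
    if name:
        print("hello", name)
"""

print(similarity(original, renamed))    # 1.0: identical structure
print(similarity(original, unrelated))  # much lower score
```

This simple profile is unaffected by identifier renaming only. Handling the other disguises named in the abstract, such as redundant statements or equivalent control structures, would require normalizing the tree before comparison, which this sketch does not attempt.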