Deviation-based obfuscation-resilient program equivalence checking with application to software plagiarism detection

Jiang Ming, Fangfang Zhang, Dinghao Wu, Peng Liu, Sencun Zhu

Research output: Contribution to journal › Article › peer-review

25 Scopus citations


Software plagiarism, the act of illegally copying others' code, has become a serious concern for honest software companies and the open source community. Considerable research effort has been dedicated to searching for evidence of software plagiarism. In this paper, we continue this line of research and propose LoPD, a deviation-based program equivalence checking approach that is an ideal fit for whole-program plagiarism detection. Instead of directly measuring the similarity between two programs, LoPD searches for any dissimilarity between them by finding an input that causes the two programs to behave differently, either with different output states or with semantically different execution paths. If we can find even one dissimilarity, the programs are semantically different; if we cannot find any, a plagiarism case is more likely. We leverage dynamic symbolic execution to capture the semantics of execution paths and to find path deviations. Compared with existing detection approaches, LoPD's formal program semantics-based method is more resilient to automatic obfuscation schemes. Our evaluation results indicate that LoPD is effective in detecting whole-program plagiarism. Furthermore, we demonstrate that LoPD can be applied to partial software plagiarism detection as well. The encouraging experimental results show that LoPD is an appealing complement to existing software plagiarism detection approaches.
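The deviation-search idea in the abstract can be illustrated with a minimal sketch. This is not LoPD's implementation: LoPD uses dynamic symbolic execution and also compares execution paths, whereas the sketch below substitutes brute-force enumeration over a small bounded input space and compares outputs only. The names `find_deviation`, `small_inputs`, `original`, `obfuscated_copy`, and `independent` are hypothetical and chosen for illustration.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

Args = Tuple[int, ...]


def find_deviation(
    prog_a: Callable[..., int],
    prog_b: Callable[..., int],
    input_space: Iterable[Args],
) -> Optional[Args]:
    """Return the first input on which the two programs disagree, else None.

    Assumption: bounded enumeration stands in for the symbolic search
    described in the abstract; only outputs are compared, not paths.
    """
    for args in input_space:
        if prog_a(*args) != prog_b(*args):
            return args
    return None


def small_inputs() -> Iterable[Args]:
    """Hypothetical bounded input space: all (x, y) pairs in [-3, 3]."""
    values = range(-3, 4)
    return product(values, values)


def original(x: int, y: int) -> int:
    """Hypothetical original program."""
    return x * 2 + y


def obfuscated_copy(x: int, y: int) -> int:
    """Hypothetical obfuscated copy: different syntax, same semantics."""
    return (x + x) + (y ^ 0)


def independent(x: int, y: int) -> int:
    """Hypothetical independently written program with different semantics."""
    return x * 2 - y


if __name__ == "__main__":
    # No deviation found: consistent with a possible plagiarism case.
    print(find_deviation(original, obfuscated_copy, small_inputs()))
    # A deviation found: the programs are semantically different.
    print(find_deviation(original, independent, small_inputs()))
```

In this sketch a returned input is conclusive evidence that the two programs differ, while `None` only means that no deviation was found within the explored inputs, which mirrors the asymmetry stated in the abstract.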

Original language: English (US)
Article number: 7490384
Pages (from-to): 1647-1664
Number of pages: 18
Journal: IEEE Transactions on Reliability
Issue number: 4
State: Published - Dec 2016

All Science Journal Classification (ASJC) codes

  • Safety, Risk, Reliability and Quality
  • Electrical and Electronic Engineering

