Achieving accuracy and scalability simultaneously in detecting application clones on Android markets

Kai Chen, Peng Liu, Yingjun Zhang

Research output: Contribution to journal › Conference article › peer-review

245 Scopus citations


Besides traditional problems such as potential bugs, (smartphone) application clones on Android markets bring new threats. Attackers clone the code of legitimate Android applications, combine it with malicious code or advertisements, and publish these "purpose-added" app clones on the same or other markets for profit. Three inherent and unique characteristics make app clones difficult for existing techniques to detect: a billion-opcode problem caused by cross-market publishing, the gap between code clones and app clones, and prevalent Type 2 and Type 3 clones. Existing techniques achieve either accuracy or scalability, but not both. To achieve both goals, we use a geometric characteristic of dependency graphs, called the centroid, to measure the similarity between methods (code fragments) in two apps. We then synthesize the method-level similarities and draw a yes/no conclusion on app (core functionality) cloning. The observed "centroid effect" and the inherent "monotonicity" property enable our approach to achieve both high accuracy and scalability. We implemented the app clone detection system and evaluated it on five whole Android markets (comprising 150,145 apps, 203 million methods, and 26 billion opcodes). Once the centroids have been generated, which is done only once, cross-market app clone detection on the five markets takes less than one hour.
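The two-stage idea in the abstract (compare methods by a centroid, then aggregate the method-level results into an app-level decision) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the abstract: the three node coordinates, the per-node weight, the normalized difference measure, the thresholds `delta` and `threshold`, and all function names are hypothetical. The sketch also compares methods by brute force, so it does not reproduce the scalability the paper reports.

```python
from dataclasses import dataclass
from typing import List, Tuple

Centroid = Tuple[float, float, float]


@dataclass(frozen=True)
class Node:
    """One node of a method's dependency graph.

    The coordinates and weight are hypothetical placeholders; the abstract
    does not say how nodes are placed in space or weighted.
    """
    x: float
    y: float
    z: float
    weight: float


def centroid(nodes: List[Node]) -> Centroid:
    """Weighted centroid (center of mass) of a method's graph nodes."""
    total = sum(n.weight for n in nodes)
    if total == 0:
        raise ValueError("cannot compute a centroid with zero total weight")
    return (
        sum(n.x * n.weight for n in nodes) / total,
        sum(n.y * n.weight for n in nodes) / total,
        sum(n.z * n.weight for n in nodes) / total,
    )


def centroid_difference(a: Centroid, b: Centroid) -> float:
    """Largest normalized per-axis difference between two centroids.

    Assumed measure: 0.0 means identical centroids; for non-negative
    coordinates the result lies in [0, 1].
    """
    diffs = []
    for p, q in zip(a, b):
        denom = abs(p) + abs(q)
        diffs.append(0.0 if denom == 0 else abs(p - q) / denom)
    return max(diffs)


def app_similarity(methods_a: List[Centroid], methods_b: List[Centroid],
                   delta: float) -> float:
    """Fraction of methods in app A with a close-enough match in app B."""
    if not methods_a:
        return 0.0
    matched = sum(
        1 for ca in methods_a
        if any(centroid_difference(ca, cb) <= delta for cb in methods_b)
    )
    return matched / len(methods_a)


def is_clone(methods_a: List[Centroid], methods_b: List[Centroid],
             delta: float = 0.05, threshold: float = 0.8) -> bool:
    """Yes/no decision: enough of A's methods also appear in B."""
    return app_similarity(methods_a, methods_b, delta) >= threshold
```

Because each method is reduced to a single small vector, the centroids can be computed once per app and reused for every later comparison, which is the property the abstract relies on when it notes that centroids are generated only once.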

Original language: English (US)
Pages (from-to): 175-186
Number of pages: 12
Journal: Proceedings - International Conference on Software Engineering
Issue number: 1
State: Published - May 31 2014
Event: 36th International Conference on Software Engineering, ICSE 2014 - Hyderabad, India
Duration: May 31 2014 to Jun 7 2014

All Science Journal Classification (ASJC) codes

  • Software

