TY - JOUR
T1 - Large-Scale Third-Party Library Detection in Android Markets
AU - Li, Menghao
AU - Wang, Pei
AU - Wang, Wei
AU - Wang, Shuai
AU - Wu, Dinghao
AU - Liu, Jian
AU - Xue, Rui
AU - Huo, Wei
AU - Zou, Wei
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2020/9/1
Y1 - 2020/9/1
AB - As mobile app markets thrive, third-party libraries are pervasively used in Android applications. The libraries provide functionalities such as advertising, location, and social networking services, making app development much more productive. However, the spread of vulnerable and harmful third-party libraries can also hurt the mobile ecosystem, leading to various security problems. Therefore, third-party library identification has emerged as an important problem, being the basis of many security applications such as repackaging detection, vulnerability identification, and malware analysis. Previously, we proposed a novel approach to identifying third-party Android libraries at a massive scale. Our method uses the internal code dependencies of an app to recognize library candidates and further classify them. With a fine-grained feature hashing strategy, we can handle code whose package and method names are obfuscated better than prior work. We have developed a prototype tool called LibD and evaluated it on an up-to-date dataset containing 1,427,395 Android apps. Our experimental results show that LibD outperforms existing tools in detecting multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without loss of scalability. In this paper, we extend our earlier work by investigating the possibility of employing effective and scalable library detection to boost the performance of large-scale app analyses in the real world. We show that the technique of LibD can be used to accelerate whole-app Android vulnerability detection and quickly identify variants of vulnerable third-party libraries. This extension paper sheds light on the practical value of our previous research.
UR - http://www.scopus.com/inward/record.url?scp=85054389625&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054389625&partnerID=8YFLogxK
U2 - 10.1109/TSE.2018.2872958
DO - 10.1109/TSE.2018.2872958
M3 - Article
AN - SCOPUS:85054389625
SN - 0098-5589
VL - 46
SP - 981
EP - 1003
JO - IEEE Transactions on Software Engineering
JF - IEEE Transactions on Software Engineering
IS - 9
M1 - 8478000
ER -