TY - GEN
T1 - VMHunt
T2 - 25th ACM Conference on Computer and Communications Security, CCS 2018
AU - Xu, Dongpeng
AU - Ming, Jiang
AU - Fu, Yu
AU - Wu, Dinghao
N1 - Funding Information:
We thank the CCS anonymous reviewers and Heng Yin for their valuable feedback. This research was supported in part by the National Science Foundation (NSF) grant CNS-1652790 and the Office of Naval Research (ONR) grants N00014-16-1-2265, N00014-16-1-2912, and N00014-17-1-2894. Jiang Ming was also supported by the University of Texas System STARs Program.
Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/10/15
Y1 - 2018/10/15
N2 - Code virtualization is a highly sophisticated obfuscation technique adopted by malware authors to stay under the radar. However, the increasing complexity of code virtualization also becomes a “double-edged sword” for practical application. Due to its performance limitations and compatibility problems, code virtualization is seldom used on an entire program. Rather, it is mainly used only to safeguard the key parts of code, such as security checks and encryption keys. Many techniques have been proposed to reverse engineer the virtualized code, but they share some common limitations. They assume the scope of virtualized code is known in advance and mainly focus on the classic structure of a code emulator. Also, few works verify the correctness of their deobfuscation results. In this paper, with fewer assumptions on the type and scope of code virtualization, we present a verifiable method to address the challenge of partially-virtualized binary code simplification. Our key insight is that code virtualization is a kind of process-level virtual machine (VM), and the context switch patterns when entering and exiting the VM can be used to detect the VM boundaries. Based on the scope of the VM boundary, we simplify the virtualized code. We first ignore all the instructions in a given virtualized snippet that do not affect the final result of that snippet. To better revert the data obfuscation effect that encodes a variable through bitwise operations, we then run a new symbolic execution called multiple granularity symbolic execution to further simplify the trace snippet. The generated concise symbolic formulas facilitate the correctness testing of our simplification results. We have implemented our idea as an open-source tool, VMHunt, and evaluated it with real-world applications and malware. The encouraging experimental results demonstrate that VMHunt is a significant improvement over the state of the art.
UR - http://www.scopus.com/inward/record.url?scp=85056907796&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056907796&partnerID=8YFLogxK
U2 - 10.1145/3243734.3243827
DO - 10.1145/3243734.3243827
M3 - Conference contribution
AN - SCOPUS:85056907796
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 442
EP - 458
BT - CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
PB - Association for Computing Machinery
Y2 - 15 October 2018
ER -