Semantics-Directed Binary Reverse Engineering and Transformation Validation

  • Tan, Gang G. (PI)

Project: Research project

Project Details

Description

The ability to analyze and transform Commercial Off-The-Shelf (COTS) binaries without access to source code hasalways been a highly' sought-after goal for computer-security practitioners. COTS binaries are prevalent in and criticalto many computer systems, includ''ing Navy~s cyber systems and devices. Automated tools that can reverse engineerlegacy binaries, apply transformations, and validate'' the transformations will enable defenders to secure legacy binarieswithout laborious manual efforts. However, most state-of-the-ar''t binary-level defense techniques are based onheuristics, make heavy assumptions about compilation toolchains, and do not handle co''de obfuscation.In this project, we aim to push forward binary-level defense techniques in two tasks. The first task is binary-level'reverse engineering. Basic information about binaries such as control-flow and data-flow information is a prerequisitefor all binary analysis and transformation. Previous reverse-engineering techniques are often ad hoc and do not have aformal basis. There is also a lack of rigorous evaluation about what would be the best reverse-engineering algorithmsin termsof precision and performance. We propose a binary-level reverse-engineering tool using the help of Datalog. Variousalgorithms for control-flow and data-flow recovering will be implemented as Datalog rules and solved using a Datalogengine. The use of Datalog enables high-precision reverse engineering and a principled exploration of the designspace of reverse-engineering algorithms. The second task is transformation vali'dation. After a transformation such asfeature removal or security hardening has been applied to a piece of binary, it is important'' to check that thetransformed binary is equivalent to the old one, as a way of ensuring that important functionalities are not lost'' bytransformation. For this task, we plan to apply automated theorem-proving techniques.Both of the tasks will be guided by the PI''~s previous work on formal semantics of machine architectures, making theapproach built on a rigorous foundation and robust to code'' obfuscation. In particular, assembly instructions will betransformed to an intermediate representation, called RTL, before being a''nalyzed for control-flow construction, dataflowconstruction, and transformation validation.If the project is successful, we will p''roduce a reverse engineering tool that has the ability to construct highly accurateinformation about binaries, which can benefit bo''th attack analysis and defense deployment. We will also produce apractical transformation validation tool, which can validate the c'orrectness of transformation including securityhardening and feature removal.The research team will be led by Dr. Gang Tan from Penn State University. He is a wellrecognized expert in programanalysis and software security. His group has published plenty of top-notch papers closely related to the proposedtasks. The team~s expertise and track record clearly demonstrate the ability to perform groundbreaking research in thearea of binary-level reverse engineering and transformation validation.

StatusActive
Effective start/end date4/18/17 → …

Funding

  • U.S. Navy: $500,000.00

Fingerprint

Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.