TY - GEN
T1 - Where does it go? Refining indirect-call targets with multi-layer type analysis
AU - Lu, Kangjie
AU - Hu, Hong
N1 - Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/11/6
Y1 - 2019/11/6
N2 - System software commonly uses indirect calls to realize dynamic program behaviors. However, indirect-calls also bring challenges to constructing a precise control-flow graph that is a standard prerequisite for many static program-analysis and system-hardening techniques. Unfortunately, identifying indirect-call targets is a hard problem. In particular, modern compilers do not recognize indirect-call targets by default. Existing approaches identify indirect-call targets based on type analysis that matches the types of function pointers and the ones of address-taken functions. Such approaches, however, suffer from a high false-positive rate as many irrelevant functions may share the same types. In this paper, we propose a new approach, namely Multi-Layer Type Analysis (MLTA), to effectively refine indirect-call targets for C/C++ programs. MLTA relies on an observation that function pointers are commonly stored into objects whose types have a multilayer type hierarchy; before indirect calls, function pointers will be loaded from objects with the same type hierarchy “layer by layer”. By matching the multi-layer types of function pointers and functions, MLTA can dramatically refine indirect-call targets. MLTA is effective because multi-layer types are more restrictive than single-layer types. It does not introduce false negatives by conservatively tracking targets propagation between multi-layer types, and the layered design allows MLTA to safely fall back whenever the analysis for a layer becomes infeasible. We have implemented MLTA in a system, namely TypeDive, based on LLVM and extensively evaluated it with the Linux kernel, the FreeBSD kernel, and the Firefox browser. Evaluation results show that TypeDive can eliminate 86% to 98% more indirect-call targets than existing approaches do, without introducing new false negatives. We also demonstrate that TypeDive not only improves the scalability of static analysis but also benefits semantic-bug detection. With TypeDive, we have found 35 new deep semantic bugs in the Linux kernel.
AB - System software commonly uses indirect calls to realize dynamic program behaviors. However, indirect-calls also bring challenges to constructing a precise control-flow graph that is a standard prerequisite for many static program-analysis and system-hardening techniques. Unfortunately, identifying indirect-call targets is a hard problem. In particular, modern compilers do not recognize indirect-call targets by default. Existing approaches identify indirect-call targets based on type analysis that matches the types of function pointers and the ones of address-taken functions. Such approaches, however, suffer from a high false-positive rate as many irrelevant functions may share the same types. In this paper, we propose a new approach, namely Multi-Layer Type Analysis (MLTA), to effectively refine indirect-call targets for C/C++ programs. MLTA relies on an observation that function pointers are commonly stored into objects whose types have a multilayer type hierarchy; before indirect calls, function pointers will be loaded from objects with the same type hierarchy “layer by layer”. By matching the multi-layer types of function pointers and functions, MLTA can dramatically refine indirect-call targets. MLTA is effective because multi-layer types are more restrictive than single-layer types. It does not introduce false negatives by conservatively tracking targets propagation between multi-layer types, and the layered design allows MLTA to safely fall back whenever the analysis for a layer becomes infeasible. We have implemented MLTA in a system, namely TypeDive, based on LLVM and extensively evaluated it with the Linux kernel, the FreeBSD kernel, and the Firefox browser. Evaluation results show that TypeDive can eliminate 86% to 98% more indirect-call targets than existing approaches do, without introducing new false negatives. We also demonstrate that TypeDive not only improves the scalability of static analysis but also benefits semantic-bug detection. With TypeDive, we have found 35 new deep semantic bugs in the Linux kernel.
UR - http://www.scopus.com/inward/record.url?scp=85075946835&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075946835&partnerID=8YFLogxK
U2 - 10.1145/3319535.3354244
DO - 10.1145/3319535.3354244
M3 - Conference contribution
AN - SCOPUS:85075946835
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 1867
EP - 1881
BT - CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
PB - Association for Computing Machinery
T2 - 26th ACM SIGSAC Conference on Computer and Communications Security, CCS 2019
Y2 - 11 November 2019 through 15 November 2019
ER -