ABSLearn: a GNN-based framework for aliasing and buffer-size information retrieval

Ke Liang, Jim Tan, Dongrui Zeng, Yongzhe Huang, Xiaolei Huang, Gang Tan

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Inferring aliasing and buffer-size information is important for understanding a C program's memory layout, which is critical to program analysis and security-related tasks. However, traditional static and dynamic program analysis methods have limitations. Static alias analysis loses precision and scales poorly. Dynamic analysis can achieve high precision, but it offers no soundness guarantee, and an online analysis may incur non-negligible runtime overhead. Moreover, current methods capture only aliasing information. The buffer-size relation, that is, which variable stores the size of the buffer that a pointer points to, is hard to analyze with traditional methods. We also observe that most methods are designed for one specific kind of information. To address these limitations, we present ABSLearn, a deep learning framework that retrieves both aliasing and buffer-size information from C programs. The core idea of ABSLearn is to formulate information retrieval as a link prediction problem, which a Graph Neural Network (GNN) model is then applied to solve. We developed the first related dataset, containing 285 C program samples, to train ABSLearn. The trained model is then applied to infer the information on three practical benchmarks: Gzip-1.2.4, Make-3.80, and Tar-1.15.1. The results show that ABSLearn achieves comparable performance and excellent runtime performance. As the first attempt at applying GNNs to infer aliasing and buffer-size information, ABSLearn can potentially benefit future program analysis frameworks.
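
The link-prediction formulation mentioned in the abstract can be pictured with a toy sketch. This is not the ABSLearn model: the graph, the two-dimensional features, the single mean-aggregation step, and the dot-product decoder are all assumptions chosen for illustration, and nothing here is trained, so the scores carry no meaning beyond showing the mechanics. Program variables become nodes, structural relations become edges, and a candidate alias or buffer-size pair is scored as a potential link.

```python
import math

# Hypothetical C fragment this toy graph stands for (not taken from the paper):
#   char *p = buf;  char *q = p;  memcpy(dst, p, len);
nodes = ["buf", "p", "q", "len"]

# Structural edges a front end might extract (assumed for illustration).
edges = [("p", "buf"), ("q", "p"), ("len", "p")]

# Hand-picked features: [is_pointer_like, is_integer].
# A real model would learn embeddings instead of using fixed flags.
features = {
    "buf": [1.0, 0.0],
    "p": [1.0, 0.0],
    "q": [1.0, 0.0],
    "len": [0.0, 1.0],
}


def neighbors(node):
    """Return the undirected neighbours of a node, sorted by name."""
    found = set()
    for a, b in edges:
        if a == node:
            found.add(b)
        if b == node:
            found.add(a)
    return sorted(found)


def message_pass(feats):
    """One round of mean aggregation over a node and its neighbours."""
    updated = {}
    for n in nodes:
        group = [feats[n]] + [feats[m] for m in neighbors(n)]
        dim = len(feats[n])
        updated[n] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return updated


def link_score(emb, u, v):
    """Dot-product decoder squashed into (0, 1) with a sigmoid."""
    dot = sum(a * b for a, b in zip(emb[u], emb[v]))
    return 1.0 / (1.0 + math.exp(-dot))


embeddings = message_pass(features)
alias_score = link_score(embeddings, "p", "q")    # candidate alias link
size_score = link_score(embeddings, "len", "p")   # candidate buffer-size link
print(f"alias(p, q) = {alias_score:.3f}, size_of(len, p) = {size_score:.3f}")
```

A trained system would replace the fixed features and the parameter-free aggregation with learned weights, and would likely use separate decoders (or edge types) for alias links and buffer-size links; the sketch shares one decoder only to stay short.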

Original language: English (US)
Pages (from-to): 1171-1189
Number of pages: 19
Journal: Pattern Analysis and Applications
Volume: 26
Issue number: 3
State: Published - Aug 2023

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
