TY - GEN
T1 - An In-depth Analysis of Duplicated Linux Kernel Bug Reports
AU - Mu, Dongliang
AU - Wu, Yuhang
AU - Chen, Yueqi
AU - Lin, Zhenpeng
AU - Yu, Chensheng
AU - Xing, Xinyu
AU - Wang, Gang
N1 - Publisher Copyright: © 2022 29th Annual Network and Distributed System Security Symposium, NDSS 2022. All Rights Reserved.
PY - 2022
Y1 - 2022
AB - In the past three years, the continuous fuzzing projects Syzkaller and Syzbot have achieved great success in detecting kernel vulnerabilities, finding more kernel bugs than those found in the past 20 years. However, a side effect of continuous fuzzing is that it generates an excessive number of crash reports, many of which are “duplicated” reports caused by the same bug. While Syzbot uses a simple heuristic to group (deduplicate) reports, we find that it is often inaccurate. In this paper, we empirically analyze the duplicated kernel bug reports to understand: (1) the prevalence of duplication; (2) the potential costs introduced by duplication; and (3) the key causes behind the duplication problem. We collected all of the fixed kernel bugs from September 2017 to November 2020, including 3.24 million crash reports grouped by Syzbot under 2,526 bug reports (identified by unique bug titles). We found that the bug reports indeed had duplication: 47.1% of the 2,526 bug reports are duplicates of one or more other reports. By analyzing the metadata of these reports, we found that undetected duplication introduced extra costs in terms of time and developer effort. Then we organized Linux kernel experts to analyze a sample of duplicated bugs (375 bug reports, 120 unique bugs) and identified 6 key contributing factors to the duplication. Based on these empirical findings, we proposed and prototyped actionable strategies for bug deduplication. After confirming their effectiveness using a ground-truth dataset, we further applied our methods and identified previously unknown duplication cases among open bugs.
UR - http://www.scopus.com/inward/record.url?scp=85167925886&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85167925886&partnerID=8YFLogxK
U2 - 10.14722/ndss.2022.24159
DO - 10.14722/ndss.2022.24159
M3 - Conference contribution
AN - SCOPUS:85167925886
T3 - 29th Annual Network and Distributed System Security Symposium, NDSS 2022
BT - 29th Annual Network and Distributed System Security Symposium, NDSS 2022
PB - The Internet Society
T2 - 29th Annual Network and Distributed System Security Symposium, NDSS 2022
Y2 - 24 April 2022 through 28 April 2022
ER -