TY - JOUR
T1 - SHErrLoc
T2 - A static holistic error locator
AU - Zhang, Danfeng
AU - Myers, Andrew C.
AU - Vytiniotis, Dimitrios
AU - Peyton-Jones, Simon
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/8
Y1 - 2017/8
N2 - We introduce a general way to locate programmer mistakes that are detected by static analyses. The program analysis is expressed in a general constraint language that is powerful enough to model type checking, information flow analysis, dataflow analysis, and points-to analysis. Mistakes in program analysis result in unsatisfiable constraints. Given an unsatisfiable system of constraints, both satisfiable and unsatisfiable constraints are analyzed to identify the program expressions most likely to be the cause of unsatisfiability. The likelihood of different error explanations is evaluated under the assumption that the programmer's code is mostly correct, so the simplest explanations are chosen, following Bayesian principles. For analyses that rely on programmer-stated assumptions, the diagnosis also identifies assumptions likely to have been omitted. The new error diagnosis approach has been implemented as a tool called SHErrLoc, which is applied to three very different program analyses, such as type inference for a highly expressive type system implemented by the Glasgow Haskell Compiler-including type classes, Generalized Algebraic Data Types (GADTs), and type families. The effectiveness of the approach is evaluated using previously collected programs containing errors. The results show that when compared to existing compilers and other tools, SHErrLoc consistently identifies the location of programmer errors significantly more accurately, without any language-specific heuristics.
AB - We introduce a general way to locate programmer mistakes that are detected by static analyses. The program analysis is expressed in a general constraint language that is powerful enough to model type checking, information flow analysis, dataflow analysis, and points-to analysis. Mistakes in program analysis result in unsatisfiable constraints. Given an unsatisfiable system of constraints, both satisfiable and unsatisfiable constraints are analyzed to identify the program expressions most likely to be the cause of unsatisfiability. The likelihood of different error explanations is evaluated under the assumption that the programmer's code is mostly correct, so the simplest explanations are chosen, following Bayesian principles. For analyses that rely on programmer-stated assumptions, the diagnosis also identifies assumptions likely to have been omitted. The new error diagnosis approach has been implemented as a tool called SHErrLoc, which is applied to three very different program analyses, such as type inference for a highly expressive type system implemented by the Glasgow Haskell Compiler-including type classes, Generalized Algebraic Data Types (GADTs), and type families. The effectiveness of the approach is evaluated using previously collected programs containing errors. The results show that when compared to existing compilers and other tools, SHErrLoc consistently identifies the location of programmer errors significantly more accurately, without any language-specific heuristics.
UR - http://www.scopus.com/inward/record.url?scp=85028501444&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85028501444&partnerID=8YFLogxK
U2 - 10.1145/3121137
DO - 10.1145/3121137
M3 - Article
AN - SCOPUS:85028501444
SN - 0164-0925
VL - 39
JO - ACM Transactions on Programming Languages and Systems
JF - ACM Transactions on Programming Languages and Systems
IS - 4
M1 - 18
ER -