Crash frequency modeling is an active research topic in traffic safety. The techniques proposed for it can be loosely classified as either statistical models or machine learning (ML) methods. Statistical models are suitable for drawing inferences and produce relationships that domain experts can verify. However, they generally suffer from low predictive performance because of built-in assumptions about the crash data and adherence to prespecified functional forms. ML methods, on the other hand, are data-driven and free from presupposed conditions on the dataset, yet they are often not interpretable. In this paper, a combination scheme is proposed to leverage the advantages of both techniques, and it is evaluated using crash data collected from urban highways in the state of Washington. The results show that this combination scheme could significantly improve the predictive performance and model fitness of statistical models without adversely impacting their interpretability.