Overcoming challenges in crash prediction modeling using discretized duration approach: An investigation of sampling approaches

Diwas Thapa, Rajesh Paleti, Sabyasachee Mishra

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Until recently, statistical approaches used for real-time crash prediction modeling were limited to case-control design and “sampling of alternatives” approaches. A recent study developed a duration-based real-time crash prediction model capable of incorporating dynamic (time-varying) covariates within its framework. The modeling approach discretizes the duration between crashes into equal time intervals, which can be modeled as alternatives in a multinomial logit framework. The approach, however, requires reformulating the original crash dataset to fit the modeling framework, which produces a considerably larger dataset and makes model estimation computationally demanding. Additionally, validation of the model in the original study is based on crash data from just one interstate, I-405, assuming homogeneous highway segments, each 5 miles in length. This study improves upon the original study by investigating sampling techniques that can be applied to the reformulated data to reduce the computational load, using 2019 crash data from two interstates, I-40 and I-55, in Memphis, Tennessee. Furthermore, discretization of inter-crash duration is undertaken following non-homogeneous segmentation of the interstates based on highway geometry, terrain, and posted speed limit. To accomplish the study objectives, a relatively small future window of 1 h with 15-minute time intervals is used to discretize the inter-crash duration and obtain the reformulated data. Sampling for model estimation is then done at the crash, epoch, and segment levels to determine which sampling technique (by crash, epoch, or segment) yields reasonable accuracy compared with the complete (100%) data. Results show that a 25% sample drawn at the epoch level can considerably reduce the computational load while providing reasonably consistent estimates.
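The discretization and epoch-level sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 1 h window, the 15-minute intervals, and the 25% epoch-level fraction come from the abstract; everything else is an assumption made for illustration, including the function names, the record layout with an "epoch" key, the reading of an epoch as the observation time from which the future window is measured, and the extra "no crash within the window" alternative.

```python
import random

WINDOW_MIN = 60      # future window of 1 h (from the abstract)
INTERVAL_MIN = 15    # interval width in minutes (from the abstract)
N_INTERVALS = WINDOW_MIN // INTERVAL_MIN  # 4 interval alternatives


def chosen_alternative(minutes_to_next_crash):
    """Map a time-to-next-crash onto a discrete alternative.

    Returns the 0-based index of the 15-minute interval containing the
    next crash. Returns N_INTERVALS when no crash falls inside the window;
    that extra alternative is an assumption of this sketch.
    """
    if minutes_to_next_crash is None or minutes_to_next_crash >= WINDOW_MIN:
        return N_INTERVALS
    return int(minutes_to_next_crash // INTERVAL_MIN)


def sample_epochs(records, fraction, seed=0):
    """Keep all records that belong to a random subset of epochs.

    `records` is a hypothetical list of dicts, each with an "epoch" key.
    Sampling whole epochs keeps every segment observed at a retained epoch.
    """
    epochs = sorted({r["epoch"] for r in records})
    k = max(1, round(fraction * len(epochs)))
    kept = set(random.Random(seed).sample(epochs, k))
    return [r for r in records if r["epoch"] in kept]


# Toy reformulated data: 8 epochs x 3 segments (made-up values).
toy = [{"epoch": e, "segment": s} for e in range(8) for s in range(3)]
subset = sample_epochs(toy, fraction=0.25)
```

With the toy data above, a 25% epoch-level sample retains 2 of the 8 epochs and all 3 segment records for each of them. Sampling at the crash or segment level would follow the same pattern, with the grouping key changed.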

Original language: English (US)
Article number: 106639
Journal: Accident Analysis and Prevention
State: Published - May 2022

All Science Journal Classification (ASJC) codes

  • Human Factors and Ergonomics
  • Safety, Risk, Reliability and Quality
  • Public Health, Environmental and Occupational Health

