TY - JOUR
T1 - Evaluating the predictive power of an SPF for two-lane rural roads with random parameters on out-of-sample observations
AU - Tang, Houjun
AU - Gayah, Vikash V.
AU - Donnell, Eric T.
N1 - Publisher Copyright: © 2019 Elsevier Ltd
PY - 2019/11
Y1 - 2019/11
N2 - Negative binomial (NB) regression is among the most common statistical modeling methods used to model crash frequencies due to its simple functional form and ability to handle over-dispersion commonly found in crash data. However, a drawback of this approach is that regression parameters are assumed to be the same across observations, which could contribute to biased parameter estimates. To alleviate this concern, the random parameters negative binomial (RPNB) model was recently proposed, which allows regression parameters to differ across observations following some known distribution. The resulting coefficients should be less biased, and thus the RPNB approach is believed to provide a more accurate relationship between independent variables and expected crash frequency. However, the prediction accuracy of the RPNB model relative to the standard NB model has not been thoroughly evaluated, particularly with respect to out-of-sample observations for which unique random parameters cannot be estimated. In this paper, the predictive power of the RPNB and NB models is examined using two-lane rural highway data from three engineering Districts in Pennsylvania. Multiple evaluation metrics are applied to assess each model type: root-mean-square error (RMSE), mean absolute error (MAE), coefficients from calibration functions, and cumulative residual (CURE) plots. The results show that the RPNB model outperforms the NB model when applied to within-sample observations (i.e., those used to estimate the model) by making use of the observation-specific coefficients. However, the RPNB model's predictions appear to be similar to or slightly less precise than those of the traditional NB model when applied to out-of-sample observations. Since the RPNB model is estimated using a simulation-based approach, sensitivity tests were also performed to see how the parameter estimates change with the number of Halton draws used to perform the simulation. For the sample sizes used in this paper, the estimates were fairly insensitive when more than 50 Halton draws were used. The findings suggest that the RPNB model is more reliable when applied to the same set of sites that were used to estimate the model but might not be as robust as the traditional NB model when applied to other sites.
AB - Negative binomial (NB) regression is among the most common statistical modeling methods used to model crash frequencies due to its simple functional form and ability to handle over-dispersion commonly found in crash data. However, a drawback of this approach is that regression parameters are assumed to be the same across observations, which could contribute to biased parameter estimates. To alleviate this concern, the random parameters negative binomial (RPNB) model was recently proposed, which allows regression parameters to differ across observations following some known distribution. The resulting coefficients should be less biased, and thus the RPNB approach is believed to provide a more accurate relationship between independent variables and expected crash frequency. However, the prediction accuracy of the RPNB model relative to the standard NB model has not been thoroughly evaluated, particularly with respect to out-of-sample observations for which unique random parameters cannot be estimated. In this paper, the predictive power of the RPNB and NB models is examined using two-lane rural highway data from three engineering Districts in Pennsylvania. Multiple evaluation metrics are applied to assess each model type: root-mean-square error (RMSE), mean absolute error (MAE), coefficients from calibration functions, and cumulative residual (CURE) plots. The results show that the RPNB model outperforms the NB model when applied to within-sample observations (i.e., those used to estimate the model) by making use of the observation-specific coefficients. However, the RPNB model's predictions appear to be similar to or slightly less precise than those of the traditional NB model when applied to out-of-sample observations. Since the RPNB model is estimated using a simulation-based approach, sensitivity tests were also performed to see how the parameter estimates change with the number of Halton draws used to perform the simulation. For the sample sizes used in this paper, the estimates were fairly insensitive when more than 50 Halton draws were used. The findings suggest that the RPNB model is more reliable when applied to the same set of sites that were used to estimate the model but might not be as robust as the traditional NB model when applied to other sites.
UR - http://www.scopus.com/inward/record.url?scp=85071165867&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071165867&partnerID=8YFLogxK
U2 - 10.1016/j.aap.2019.105275
DO - 10.1016/j.aap.2019.105275
M3 - Article
C2 - 31465933
AN - SCOPUS:85071165867
SN - 0001-4575
VL - 132
JO - Accident Analysis and Prevention
JF - Accident Analysis and Prevention
M1 - 105275
ER -