A novel bi-level clustering optimization approach to balance treatment of crash data

Research output: Contribution to journal › Article › peer-review

Abstract

Understanding the impact of safety countermeasures on crash outcomes is crucial but challenging. When cross-sectional data are used to quantify a countermeasure's effectiveness, underlying differences in road characteristics can lead to imbalances between treated sites and control sites that do not have the countermeasure, which can bias the evaluation. Propensity score-based matching methods have been widely used in the traffic safety literature to identify treated and control sites with more balanced covariates; however, the use of propensity scores does not guarantee that bias between treated and control entities is minimized, and its success depends heavily on how the propensity score model is formulated. To address this issue, this study introduces a novel Bi-Level Clustering Optimization (BLCO) method that matches treated and control sites in a way that minimizes imbalance across the two groups. The proposed method uses competitive learning to directly minimize the sum of squares of the standardized bias of covariates across the treated and control groups, better simulating the conditions of a randomized trial using non-random observational data. The proposed BLCO method was compared with propensity score matching based on binary logit regression and on random forest algorithms, as well as with the genetic matching method. The results demonstrate that the proposed BLCO method significantly outperforms these benchmarks at balancing covariates across treated and control groups, reducing mean absolute standardized bias by 96.16% compared to the unmatched data and achieving an 88.76% improvement over propensity score matching. Additionally, treatment effects on the treated estimated using the optimally clustered data showed better model fit than those from the other methods. The proposed method is robust across varying dataset sizes and efficiently handles high-dimensional covariates without transformation, making it applicable to different domains for treatment effect estimation and informed decision-making.
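The balance metrics named in the abstract can be illustrated with a short sketch. This is not the BLCO method itself, only the kind of quantity it is said to minimize. The abstract does not give the exact formula, so the code assumes the common convention of a difference in group means divided by the pooled standard deviation, expressed in percent. The function names (`standardized_bias`, `balance_summary`, `percent_reduction`) and the sample covariates are hypothetical.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def standardized_bias(treated, control):
    """Standardized difference in means for one covariate, in percent.

    Assumed convention (not taken from the paper): difference in group
    means divided by the square root of the average of the two sample
    variances. Each group needs at least two observations.
    """
    diff = mean(treated) - mean(control)
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0.0:
        # Both groups are constant: perfectly balanced if the means agree,
        # otherwise the standardized difference is unbounded.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return 100.0 * diff / pooled_sd


def balance_summary(treated, control):
    """Summarize imbalance over all covariates.

    `treated` and `control` map covariate names to lists of values.
    Returns the per-covariate biases, their sum of squares (the kind of
    objective the abstract says is minimized), and the mean absolute
    standardized bias.
    """
    biases = {
        name: standardized_bias(treated[name], control[name])
        for name in treated
    }
    sum_of_squares = sum(b * b for b in biases.values())
    mean_abs_bias = mean([abs(b) for b in biases.values()])
    return biases, sum_of_squares, mean_abs_bias


def percent_reduction(before, after):
    """Percentage reduction in a balance metric relative to `before`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical covariates for a handful of sites (illustration only).
    treated_sites = {
        "traffic_volume": [12.0, 15.0, 14.0, 18.0],
        "lane_width": [3.6, 3.5, 3.7, 3.6],
    }
    control_sites = {
        "traffic_volume": [8.0, 9.0, 11.0, 10.0],
        "lane_width": [3.3, 3.4, 3.2, 3.5],
    }
    biases, sum_sq, masb = balance_summary(treated_sites, control_sites)
    print(biases, sum_sq, masb)
```

Under this convention, a matching procedure would be judged by how far it drives the mean absolute standardized bias down relative to the unmatched data, which is what `percent_reduction` computes when given the metric before and after matching.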

Original language: English (US)
Article number: 108107
Journal: Accident Analysis and Prevention
Volume: 219
State: Published - Sep 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

All Science Journal Classification (ASJC) codes

  • Human Factors and Ergonomics
  • Safety, Risk, Reliability and Quality
  • Public Health, Environmental and Occupational Health
  • Law
