Abstract
Sparse high-dimensional massive sample size (sHDMSS) time-to-event data present multiple challenges to quantitative researchers, as most current sparse survival regression methods and software grind to a halt and become practically inoperable on data of this scale. This paper develops a scalable ℓ0-based sparse Cox regression tool for right-censored time-to-event data that easily takes advantage of existing high-performance implementations of ℓ2-penalized regression methods for sHDMSS time-to-event data. Specifically, we extend the ℓ0-based broken adaptive ridge (BAR) methodology to the Cox model, which involves repeatedly performing reweighted ℓ2-penalized regression. We rigorously show that the resulting estimator for the Cox model is selection consistent, oracle for parameter estimation, and has a grouping property for highly correlated covariates. Furthermore, we implement our BAR method in an R package for sHDMSS time-to-event data by leveraging existing efficient algorithms for massive ℓ2-penalized Cox regression. We evaluate the BAR Cox regression method by extensive simulations and illustrate its application on an sHDMSS time-to-event dataset from the National Trauma Data Bank with hundreds of thousands of observations and tens of thousands of sparsely represented covariates.
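The phrase "repeatedly performing reweighted ℓ2-penalized regression" can be illustrated with a small sketch. The Python code below is not the authors' R package and makes no attempt at scalability: it uses dense Newton steps, assumes no tied event times, and assumes the usual BAR form in which each step minimizes −2·(log partial likelihood) + λ·Σ βⱼ²/(previous βⱼ)², starting from a ridge fit. The function names, the tuning values `xi` and `lam`, the zero threshold, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def cox_grad_hess(beta, X, time, event):
    """Gradient and Hessian of the negative Cox log partial likelihood.
    Assumes no tied event times."""
    order = np.argsort(-time)                 # latest times first
    Xs, ds = X[order], event[order].astype(bool)
    eta = Xs @ beta
    w = np.exp(eta - eta.max())               # shift for numerical stability
    s0 = np.cumsum(w)                         # risk-set sums of weights
    s1 = np.cumsum(w[:, None] * Xs, axis=0)   # risk-set sums of w * x
    s2 = np.cumsum(w[:, None, None] * Xs[:, :, None] * Xs[:, None, :], axis=0)
    xbar = s1[ds] / s0[ds][:, None]           # weighted covariate means
    grad = -(Xs[ds] - xbar).sum(axis=0)
    hess = (s2[ds] / s0[ds][:, None, None]).sum(axis=0) - xbar.T @ xbar
    return grad, hess

def penalized_cox(X, time, event, pen, beta0, n_newton=25, tol=1e-8):
    """Minimize -2*loglik(beta) + sum_j pen[j]*beta[j]**2 by Newton's method."""
    beta = beta0.copy()
    for _ in range(n_newton):
        grad, hess = cox_grad_hess(beta, X, time, event)
        # The factors of 2 in the gradient and Hessian cancel in the step.
        step = np.linalg.solve(hess + np.diag(pen), grad + pen * beta)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def bar_cox(X, time, event, xi=1.0, lam=1.0, max_iter=50,
            zero_tol=1e-6, tol=1e-8):
    """Iteratively reweighted ridge (BAR-style) Cox regression sketch."""
    p = X.shape[1]
    # Initial estimate: ordinary ridge-penalized Cox fit.
    beta = penalized_cox(X, time, event, np.full(p, xi), np.zeros(p))
    for _ in range(max_iter):
        # Coefficients that have collapsed to ~0 stay at 0 (infinite penalty).
        active = np.abs(beta) > zero_tol
        new = np.zeros(p)
        if active.any():
            pen = lam / beta[active] ** 2     # reweighting by previous estimate
            new[active] = penalized_cox(X[:, active], time, event,
                                        pen, beta[active])
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    beta[np.abs(beta) <= zero_tol] = 0.0
    return beta

# Simulated right-censored data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 400, 8
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, -1.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0])
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))
t_cens = rng.exponential(3.0, size=n)
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

beta_hat = bar_cox(X, time, event, xi=1.0, lam=np.log(n))
print(np.round(beta_hat, 3))
```

At a fixed point the penalty term βⱼ²/βⱼ² equals 1 for every nonzero coefficient and 0 otherwise, which is why the reweighted ridge iteration acts as a surrogate for an ℓ0 penalty. Each inner solve is an ordinary ℓ2-penalized Cox fit, so in principle any existing ridge Cox solver could replace `penalized_cox` here.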
| Original language | English (US) |
|---|---|
| Pages (from-to) | 675-686 |
| Number of pages | 12 |
| Journal | Statistics in Medicine |
| Volume | 39 |
| Issue number | 6 |
| DOIs | |
| State | Published - Mar 15 2020 |
All Science Journal Classification (ASJC) codes
- Epidemiology
- Statistics and Probability
Fingerprint
Dive into the research topics of 'A surrogate ℓ0 sparse Cox's regression with applications to sparse high-dimensional massive sample size time-to-event data'. Together they form a unique fingerprint.