A three-stage approach to identify biomarker signatures for cancer genetic data with survival endpoints

Xue Wu, Chixiang Chen, Zheng Li, Lijun Zhang, Vernon M. Chinchilli, Ming Wang

Research output: Contribution to journal › Article › peer-review


The identification of prognostic and predictive biomarker signatures is crucial for drug development and for providing personalized treatment to cancer patients. However, the discovery process often involves high-dimensional candidate biomarkers, leading to inflated family-wise error rates (FWERs) due to multiple hypothesis testing. This problem is understudied, particularly under the survival framework. To address it, we propose a novel three-stage approach for identifying significant biomarker signatures, including prognostic biomarkers (main effects) and predictive biomarkers (biomarker-by-treatment interactions), using Cox proportional hazards regression with high-dimensional covariates. To control the FWER, we adopt an adaptive group LASSO for variable screening and selection. We then derive adjusted p-values through multi-splitting and bootstrapping to overcome the invalid p-values caused by the penalized approach’s restrictions. Our extensive simulations provide an empirical evaluation of the FWER and model selection accuracy, demonstrating that our proposed three-stage approach outperforms existing alternatives. Furthermore, we provide detailed proofs and a software implementation in R to support our theoretical contributions. Finally, we apply our method to real data from cancer genetic studies.
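The abstract does not spell out how the adjusted p-values are computed. As a rough illustration only, the sketch below shows one widely used multi-split aggregation rule (in the style of Meinshausen, Meier, and Bühlmann): within each random split, p-values of the selected biomarkers are Bonferroni-adjusted by the number selected, unselected biomarkers receive 1, and the per-split values are combined through a scaled empirical quantile. The function names, the default `gamma = 0.5`, and the quantile convention are assumptions made for this sketch. The authors' R implementation may differ, and the Cox model fitting, group LASSO screening, and bootstrapping steps are not shown.

```python
import math


def bonferroni_within_split(raw_pvalues, n_candidates):
    """Adjust the p-values from one data split (hypothetical helper).

    raw_pvalues maps the index of each biomarker selected in the
    screening half to its raw p-value from the refit on the other half.
    Unselected biomarkers get an adjusted p-value of 1.
    """
    n_selected = len(raw_pvalues)
    adjusted = [1.0] * n_candidates
    for j, p in raw_pvalues.items():
        adjusted[j] = min(1.0, p * n_selected)
    return adjusted


def aggregate_splits(split_pvalues, gamma=0.5):
    """Combine per-split adjusted p-values with a scaled gamma-quantile.

    split_pvalues is a list with one row per split; each row holds one
    adjusted p-value per candidate biomarker. The empirical quantile is
    taken as the ceil(gamma * B)-th smallest value (an assumed convention).
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    n_splits = len(split_pvalues)
    n_candidates = len(split_pvalues[0])
    k = max(1, math.ceil(gamma * n_splits))
    aggregated = []
    for j in range(n_candidates):
        scaled = sorted(row[j] / gamma for row in split_pvalues)
        aggregated.append(min(1.0, scaled[k - 1]))
    return aggregated


# Toy example: three splits, three candidate biomarkers (made-up numbers).
splits = [
    bonferroni_within_split({0: 0.001, 1: 0.02}, n_candidates=3),
    bonferroni_within_split({0: 0.004}, n_candidates=3),
    bonferroni_within_split({0: 0.0005, 2: 0.3}, n_candidates=3),
]
aggregated = aggregate_splits(splits, gamma=0.5)
significant = [j for j, p in enumerate(aggregated) if p <= 0.05]
print(aggregated, significant)
```

In this toy input only the first biomarker is selected in every split, so it is the only one whose aggregated p-value stays below 0.05. Biomarkers that are selected in fewer than half of the splits are pushed to 1 by the quantile step, which is what makes the rule conservative enough to control the FWER under its assumptions.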

Original language: English (US)
Journal: Statistical Methods and Applications
State: Accepted/In press - 2024

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
