Variable selection via partial correlation

Runze Li, Jingyuan Liu, Lejia Lou

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

A partial correlation-based variable selection method was proposed for normal linear regression models by Bühlmann, Kalisch and Maathuis (2010) as an alternative to regularization methods for variable selection. This paper addresses two issues: (a) whether the method is sensitive to the normality assumption, and (b) whether the method is valid when the dimension of the predictors increases at an exponential rate in the sample size. To address (a), we study the method for elliptical linear regression models. Our findings indicate that the original proposal can lead to inferior performance when the marginal kurtosis of the predictors is not close to that of the normal distribution, and simulation results confirm this. To ensure the superior performance of the partial correlation-based variable selection procedure, we propose a thresholded partial correlation (TPC) approach to select significant variables in linear regression models. We establish the selection consistency of the TPC in the presence of ultrahigh-dimensional predictors. Since the TPC procedure includes the original proposal as a special case, our results address issue (b) directly. As a by-product, the sure screening property of the first step of TPC is obtained. Numerical examples illustrate that the TPC is comparable to the commonly used regularization methods for variable selection.
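The general idea of selecting variables by thresholding sample partial correlations can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`partial_corr`, `threshold`, `select`), the Fisher-z style cutoff with `n - m - 3` degrees of freedom, the `sqrt(1 + kappa)` inflation for a kurtosis parameter `kappa`, and the `max_order` cap are all assumptions made for this example. The inflation follows the standard result that sample correlations from elliptical distributions have asymptotic variance scaled by `1 + kappa`, with `kappa = 0` in the normal case.

```python
import itertools
import math
from statistics import NormalDist

import numpy as np


def partial_corr(x, y, Z):
    """Sample partial correlation of x and y given the columns of Z."""
    if Z.shape[1] > 0:
        # Regress x and y on Z (with intercept) and correlate the residuals.
        Z1 = np.column_stack([np.ones(len(x)), Z])
        x = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]
        y = y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]
    return float(np.corrcoef(x, y)[0, 1])


def threshold(n, m, alpha=0.05, kappa=0.0):
    """Cutoff for |partial correlation| with m conditioning variables.

    Assumption: Fisher-z cutoff, inflated by sqrt(1 + kappa) to allow for
    non-normal (elliptical) kurtosis; kappa = 0 gives the normal-theory cutoff.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(math.sqrt(1.0 + kappa) * z / math.sqrt(n - m - 3))


def select(X, y, alpha=0.05, kappa=0.0, max_order=2):
    """Keep predictor j only if its partial correlation with y exceeds the
    cutoff for every conditioning set of size m drawn from the active set."""
    n, p = X.shape
    active = list(range(p))
    for m in range(max_order + 1):
        if len(active) <= m:
            break
        cut = threshold(n, m, alpha, kappa)
        keep = []
        for j in active:
            others = [k for k in active if k != j]
            if all(abs(partial_corr(X[:, j], y, X[:, list(S)])) > cut
                   for S in itertools.combinations(others, m)):
                keep.append(j)
        active = keep
    return active


# Hypothetical toy data: only predictors 0 and 3 enter the model.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(300)
selected = select(X, y)
print(selected)
```

Setting `kappa` above zero raises the cutoff, which is one way to picture how a kurtosis-aware threshold guards against spurious selections under heavier tails. In practice `kappa` would have to be estimated from the data, and the exhaustive search over conditioning sets is only feasible for small `max_order`.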

Original language: English (US)
Pages (from-to): 983-996
Number of pages: 14
Journal: Statistica Sinica
Volume: 27
Issue number: 3
State: Published - Jul 2017

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
