We propose a non-parametric variable selection method which does not rely on any regression model or predictor distribution. The method is based on a new statistical relationship, called additive conditional independence, that has been introduced recently for graphical models. Unlike most existing variable selection methods, which target the mean of the response, the method proposed targets a set of attributes of the response, such as its mean, variance or entire distribution. In addition, the additive nature of this approach offers non-parametric flexibility without employing multi-dimensional kernels. As a result, it retains high accuracy for high-dimensional predictors. We establish estimation consistency, convergence rate and variable selection consistency of the method proposed. Through simulation comparisons, we demonstrate that the method proposed performs better than existing methods when the predictors affect several attributes of the response, and that it performs competently in the classical setting where the predictors affect the mean only. We apply the new method to a data set concerning how gene expression levels affect the weight of mice.
| Original language | English (US) |
| Number of pages | 19 |
| Journal | Journal of the Royal Statistical Society. Series B: Statistical Methodology |
| State | Published - Nov 1 2016 |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty