Abstract
Feature screening is a key step in handling ultrahigh dimensional data sets that are ubiquitous in modern statistical problems. Over the last decade, convex relaxation based approaches (e.g., Lasso/sparse additive model) have been extensively developed and analyzed for feature selection in high dimensional regime. But in the ultrahigh dimensional regime, these approaches suffer from several problems, both computationally and statistically. To overcome these issues, in this paper, we propose a novel Hilbert space embedding based approach to independence screening for ultrahigh dimensional data sets. The proposed approach is model-free (i.e., no model assumption is made between response and predictors) and could handle non-standard (e.g., graphs) and multivariate outputs directly. We establish the sure screening property of the proposed approach in the ultrahigh dimensional regime, and experimentally demonstrate its advantages and superiority over other approaches on several synthetic and real data sets.
Original language | English (US) |
---|---|
Pages (from-to) | 126-134 |
Number of pages | 9 |
Journal | Journal of Machine Learning Research |
Volume | 31 |
State | Published - 2013 |
Event | 16th International Conference on Artificial Intelligence and Statistics, AISTATS 2013 - Scottsdale, United States Duration: Apr 29 2013 → May 1 2013 |
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Software
- Statistics and Probability
- Artificial Intelligence