TY - JOUR
T1 - Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis
AU - Cui, Hengjian
AU - Li, Runze
AU - Zhong, Wei
N1 - Publisher Copyright: © 2015 American Statistical Association.
PY - 2015/4/3
Y1 - 2015/4/3
N2 - This work is concerned with marginal sure independence feature screening for ultrahigh dimensional discriminant analysis. The response variable is categorical in discriminant analysis. This enables us to use the conditional distribution function to construct a new index for feature screening. In this article, we propose a marginal feature screening procedure based on empirical conditional distribution function. We establish the sure screening and ranking consistency properties for the proposed procedure without assuming any moment condition on the predictors. The proposed procedure enjoys several appealing merits. First, it is model-free in that its implementation does not require specification of a regression model. Second, it is robust to heavy-tailed distributions of predictors and the presence of potential outliers. Third, it allows the categorical response having a diverging number of classes in the order of O(n^κ) with some κ ⩾ 0. We assess the finite sample property of the proposed procedure by Monte Carlo simulation studies and numerical comparison. We further illustrate the proposed methodology by empirical analyses of two real-life datasets. Supplementary materials for this article are available online.
AB - This work is concerned with marginal sure independence feature screening for ultrahigh dimensional discriminant analysis. The response variable is categorical in discriminant analysis. This enables us to use the conditional distribution function to construct a new index for feature screening. In this article, we propose a marginal feature screening procedure based on empirical conditional distribution function. We establish the sure screening and ranking consistency properties for the proposed procedure without assuming any moment condition on the predictors. The proposed procedure enjoys several appealing merits. First, it is model-free in that its implementation does not require specification of a regression model. Second, it is robust to heavy-tailed distributions of predictors and the presence of potential outliers. Third, it allows the categorical response having a diverging number of classes in the order of O(n^κ) with some κ ⩾ 0. We assess the finite sample property of the proposed procedure by Monte Carlo simulation studies and numerical comparison. We further illustrate the proposed methodology by empirical analyses of two real-life datasets. Supplementary materials for this article are available online.
UR - http://www.scopus.com/inward/record.url?scp=84936817500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84936817500&partnerID=8YFLogxK
U2 - 10.1080/01621459.2014.920256
DO - 10.1080/01621459.2014.920256
M3 - Article
AN - SCOPUS:84936817500
SN - 0162-1459
VL - 110
SP - 630
EP - 641
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 510
ER -