Model-Free Feature Screening and FDR Control With Knockoff Features

Wanjun Liu, Yuan Ke, Jingyuan Liu, Runze Li

Research output: Contribution to journal › Article › peer-review

29 Scopus citations

Abstract

This article proposes a model-free and data-adaptive feature screening method for ultrahigh-dimensional data. The proposed method is based on the projection correlation, which measures the dependence between two random vectors. This projection-correlation-based method does not require specifying a regression model, and it applies to data with heavy tails and multivariate responses. It enjoys both sure screening and rank consistency properties under weak assumptions. A two-step approach, with the help of knockoff features, is advocated to specify the threshold for feature screening such that the false discovery rate (FDR) is controlled under a prespecified level. The proposed two-step approach enjoys both sure screening and FDR control simultaneously if the prespecified FDR level is greater than or equal to 1/s, where s is the number of active features. The superior empirical performance of the proposed method is illustrated by simulation examples and real data applications. Supplementary materials for this article are available online.
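To make the role of the 1/s condition more concrete, the sketch below applies the generic knockoff+ thresholding rule of Barber and Candès to a vector of feature statistics. It is only an illustration under stated assumptions and is not taken from the article: the function name `knockoff_plus_threshold`, the statistics `w` (imagined as a screening statistic for each original feature minus that of its knockoff copy), and the toy numbers are all hypothetical. The rule picks the smallest threshold t whose estimated false discovery proportion, (1 + #{w_j <= -t}) / max(#{w_j >= t}, 1), is at most the target level q.

```python
import numpy as np


def knockoff_plus_threshold(w, q):
    """Generic knockoff+ threshold (hypothetical helper, not the article's code).

    Returns the smallest t among the nonzero |w_j| such that
    (1 + #{w_j <= -t}) / max(#{w_j >= t}, 1) <= q, or inf if no such t exists.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds: distinct nonzero magnitudes, in increasing order.
    candidates = np.unique(np.abs(w[w != 0]))
    for t in candidates:
        est_false = 1 + np.sum(w <= -t)      # the "+1" makes the estimate conservative
        n_selected = max(np.sum(w >= t), 1)  # avoid division by zero
        if est_false / n_selected <= q:
            return float(t)
    return float("inf")


# Toy statistics (assumed): four clearly positive values play the role of
# s = 4 active features; the rest are small and of mixed sign.
w = np.array([5.0, 4.0, 3.5, 3.0, -0.5, 0.4, -0.3, 0.2])

t_ok = knockoff_plus_threshold(w, q=0.25)    # q equals 1/s
selected = np.flatnonzero(w >= t_ok)         # indices of retained features
t_none = knockoff_plus_threshold(w, q=0.20)  # q below 1/s
```

Because the numerator is at least 1, retaining s features requires 1/s <= q. In this toy example the rule keeps the four large statistics when q = 0.25 and selects nothing when q = 0.20, which is consistent with the kind of condition on the FDR level stated in the abstract.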

Original language: English (US)
Pages (from-to): 428-443
Number of pages: 16
Journal: Journal of the American Statistical Association
Volume: 117
Issue number: 537
State: Published - 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
