TY - JOUR
T1 - Simultaneous variable selection and smoothing for high-dimensional function-on-scalar regression
AU - Parodi, Alice
AU - Reimherr, Matthew
N1 - Funding Information: Supported by NSF DMS 1712826 and NIDA P50 DA039838. Publisher Copyright: © 2018, Institute of Mathematical Statistics. All rights reserved.
PY - 2018
Y1 - 2018
N2 - We present a new methodology, called FLAME, which simultaneously selects important predictors and produces smooth estimates in a function-on-scalar linear model with a large number of scalar predictors. Our framework applies quite generally by viewing the functional outcomes as elements of an arbitrary real separable Hilbert space. To select important predictors while also producing smooth parameter estimates, we utilize operators to define subspaces that are imbued with certain desirable properties as determined by the practitioner and the setting, such as smoothness or periodicity. In special cases, one can show that these subspaces correspond to Reproducing Kernel Hilbert Spaces; however, our methodology applies more broadly. We provide a very fast algorithm for computing the estimators, which is based on a functional coordinate descent, and an R package, flm, whose backend is written in C++. Asymptotic properties of the estimators are developed and simulations are provided to illustrate the advantages of FLAME over existing methods, both in terms of statistical performance and computational efficiency. We conclude with an application to childhood asthma, where we find a potentially important genetic mutation that was not selected by previous functional-data-based methods.
UR - http://www.scopus.com/inward/record.url?scp=85063380011&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063380011&partnerID=8YFLogxK
U2 - 10.1214/18-EJS1509
DO - 10.1214/18-EJS1509
M3 - Article
AN - SCOPUS:85063380011
SN - 1935-7524
VL - 12
SP - 4602
EP - 4639
JO - Electronic Journal of Statistics
JF - Electronic Journal of Statistics
IS - 2
ER -