TY - JOUR
T1 - Convergence guarantee for the sparse monotone single index model
AU - Dai, Ran
AU - Song, Hyebin
AU - Barber, Rina Foygel
AU - Raskutti, Garvesh
N1 - Funding Information:
R.F.B. was partially supported by the National Science Foundation via grants DMS-1654076 and DMS-2023109, and by the Office of Naval Research via grant N00014-20-1-2337. G.R. was partially supported by the National Science Foundation via grant DMS-1811767 and by the National Institutes of Health via grant R01 GM131381-01. H.S. was partially supported by the National Institutes of Health via grant R01 GM131381-01.
Publisher Copyright:
© 2022, Institute of Mathematical Statistics. All rights reserved.
PY - 2022
Y1 - 2022
N2 - We consider a high-dimensional monotone single index model (hdSIM), a semiparametric extension of the high-dimensional generalized linear model (hdGLM) in which the link function is unknown but constrained to be monotone non-decreasing. We develop a scalable projection-based iterative approach, the “Sparse Orthogonal Descent Single-Index Model” (SOD-SIM), which alternates between sparse-thresholded orthogonalized “gradient-like” steps and isotonic regression steps to recover the coefficient vector. Our main contribution is to provide finite-sample estimation bounds for both the coefficient vector and the link function in high-dimensional settings under very mild assumptions on the design matrix X, the error term ɛ, and their dependence. The convergence rate for the link function matches the low-dimensional isotonic regression minimax rate, n^{-1/3}, up to poly-log terms. The convergence rate for the coefficients is also n^{-1/3} up to poly-log terms. The method can be applied to many real data problems, including GLMs with a mis-specified link, classification with mislabeled data, and classification with positive-unlabeled (PU) data. We study the performance of the method via both numerical studies and an application to a PU data example.
AB - We consider a high-dimensional monotone single index model (hdSIM), a semiparametric extension of the high-dimensional generalized linear model (hdGLM) in which the link function is unknown but constrained to be monotone non-decreasing. We develop a scalable projection-based iterative approach, the “Sparse Orthogonal Descent Single-Index Model” (SOD-SIM), which alternates between sparse-thresholded orthogonalized “gradient-like” steps and isotonic regression steps to recover the coefficient vector. Our main contribution is to provide finite-sample estimation bounds for both the coefficient vector and the link function in high-dimensional settings under very mild assumptions on the design matrix X, the error term ɛ, and their dependence. The convergence rate for the link function matches the low-dimensional isotonic regression minimax rate, n^{-1/3}, up to poly-log terms. The convergence rate for the coefficients is also n^{-1/3} up to poly-log terms. The method can be applied to many real data problems, including GLMs with a mis-specified link, classification with mislabeled data, and classification with positive-unlabeled (PU) data. We study the performance of the method via both numerical studies and an application to a PU data example.
UR - http://www.scopus.com/inward/record.url?scp=85136230142&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136230142&partnerID=8YFLogxK
U2 - 10.1214/22-EJS2046
DO - 10.1214/22-EJS2046
M3 - Article
AN - SCOPUS:85136230142
SN - 1935-7524
VL - 16
SP - 4449
EP - 4496
JO - Electronic Journal of Statistics
JF - Electronic Journal of Statistics
IS - 2
ER -