TY - JOUR
T1 - MIP-BOOST
T2 - Efficient and Effective L0 Feature Selection for Linear Regression
AU - Kenney, Ana
AU - Chiaromonte, Francesca
AU - Felici, Giovanni
N1 - Funding Information:
This work was partially funded by the NIH B2D2K training grant and the Huck Institutes of the Life Sciences of Penn State, and by NSF grant DMS-1407639. Computation was performed on the Roar Supercomputer at Penn State University. We thank Matthew Reimherr for useful discussions and comments.
Publisher Copyright:
© 2020 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
PY - 2021
Y1 - 2021
N2 - Recent advances in mathematical programming have made mixed integer optimization a competitive alternative to popular regularization methods for selecting features in regression problems. The approach exhibits unquestionable foundational appeal and versatility, but also poses important challenges. Here, we propose MIP-BOOST, a revision of standard mixed integer programming feature selection that reduces the computational burden of tuning the critical sparsity bound parameter and improves performance in the presence of feature collinearity and of signals that vary in nature and strength. The final outcome is a more efficient and effective L0 feature selection method for applications of realistic size and complexity, grounded on rigorous cross-validation tuning and exact optimization of the associated mixed integer program. Computational viability and improved performance in realistic scenarios are achieved through three independent but synergistic proposals. Supplementary materials including additional results, pseudocode, and computer code are available online.
AB - Recent advances in mathematical programming have made mixed integer optimization a competitive alternative to popular regularization methods for selecting features in regression problems. The approach exhibits unquestionable foundational appeal and versatility, but also poses important challenges. Here, we propose MIP-BOOST, a revision of standard mixed integer programming feature selection that reduces the computational burden of tuning the critical sparsity bound parameter and improves performance in the presence of feature collinearity and of signals that vary in nature and strength. The final outcome is a more efficient and effective L0 feature selection method for applications of realistic size and complexity, grounded on rigorous cross-validation tuning and exact optimization of the associated mixed integer program. Computational viability and improved performance in realistic scenarios are achieved through three independent but synergistic proposals. Supplementary materials including additional results, pseudocode, and computer code are available online.
UR - http://www.scopus.com/inward/record.url?scp=85098703715&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098703715&partnerID=8YFLogxK
U2 - 10.1080/10618600.2020.1845184
DO - 10.1080/10618600.2020.1845184
M3 - Article
AN - SCOPUS:85098703715
SN - 1061-8600
VL - 30
SP - 566
EP - 577
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 3
ER -