TY - GEN
T1 - Scalable, efficient, stepwise-optimal feature elimination in support vector machines
AU - Aksu, Yaman
AU - Kesidis, George
AU - Miller, David J.
PY - 2007
Y1 - 2007
N2 - We address feature selection for support vector machines for the scenario in which the feature space is huge, i.e., 10^5-10^6 or more features, as may occur, e.g., in a biomedical context working with 3-D (or 4-D) brain images. Feature selection in this case may be needed to improve the classifier's generalization performance (given limited training data), to reduce classification complexity, and/or to identify a minimum subset of features necessary for accurate classification, i.e., a set of putative "biomarkers". While there are a variety of techniques for SVM-based feature selection, many of these may be unsuitable for huge feature spaces due to computational and/or memory requirements. One popular, lightweight scheme is recursive feature elimination (RFE) [5], wherein the feature with smallest weight magnitude in the current solution is eliminated at each step. Here we propose an alternative to RFE that is stepwise superior in that it maximizes margin (in the separable case) and minimizes training error rate (in the non-separable case), rather than minimizing weight magnitude. Moreover, we formulate an algorithm that achieves this stepwise maximum margin feature elimination without requiring explicit margin evaluation for all the remaining (candidate) features - in this way, the method achieves reduced complexity. To date, we have only performed experiments on (modestly dimensioned) UC Irvine data sets, which demonstrate better classification accuracy of our scheme (both training and test) over RFE. At the workshop, we will present results on huge feature spaces, for disease classification of 3-D MRI brain images and on other data domains.
UR - http://www.scopus.com/inward/record.url?scp=48149100965&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=48149100965&partnerID=8YFLogxK
U2 - 10.1109/MLSP.2007.4414285
DO - 10.1109/MLSP.2007.4414285
M3 - Conference contribution
AN - SCOPUS:48149100965
SN - 1424415667
SN - 9781424415663
T3 - Machine Learning for Signal Processing 17 - Proceedings of the 2007 IEEE Signal Processing Society Workshop, MLSP
SP - 75
EP - 80
BT - Machine Learning for Signal Processing 17 - Proceedings of the 2007 IEEE Signal Processing Society Workshop, MLSP
T2 - 17th IEEE International Workshop on Machine Learning for Signal Processing, MLSP-2007
Y2 - 27 August 2007 through 29 August 2007
ER -