We consider a feature selection problem where the decision-making objective is to minimize overall misclassification cost by selecting relevant features from a training dataset. We propose a two-stage solution approach for solving misclassification cost minimizing feature selection (MCMFS) problem. Additionally, we propose a maximum-margin genetic algorithm (MMGA) that maximizes margin of separation between classes by taking into account all examples as opposed to maximizing margin of separation using a few support vectors. Feature selection is carried out by either an exhaustive or a heuristic simulated annealing approach in the first stage and a cost sensitive classification using either MMGA or cost sensitive support vector machines (SVM) in the second stage. Using simulated and real-world data sets and different misclassification cost matrices, we test our two-stage approach for solving the MCMFS problem. Our results indicate that feature selection plays an important role when misclassification cost asymmetries increase and the MMGA shows equal or better performance than the SVM.
All Science Journal Classification (ASJC) codes
- General Engineering
- Computer Science Applications
- Artificial Intelligence