TY - GEN
T1 - Visual methods for examining SVM classifiers
AU - Caragea, Doina
AU - Cook, Dianne
AU - Wickham, Hadley
AU - Honavar, Vasant
N1 - Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2008
Y1 - 2008
AB - Support vector machines (SVM) offer a theoretically well-founded approach to automated learning of pattern classifiers. They have been proven to give highly accurate results in complex classification problems, for example, gene expression analysis. The SVM algorithm is also quite intuitive with a few inputs to vary in the fitting process and several outputs that are interesting to study. For many data mining tasks (e.g., cancer prediction) finding classifiers with good predictive accuracy is important, but understanding the classifier is equally important. By studying the classifier outputs we may be able to produce a simpler classifier, learn which variables are the important discriminators between classes, and find the samples that are problematic to the classification. Visual methods for exploratory data analysis can help us to study the outputs and complement automated classification algorithms in data mining. We present the use of tour-based methods to plot aspects of the SVM classifier. This approach provides insights about the cluster structure in the data, the nature of boundaries between clusters, and problematic outliers. Furthermore, tours can be used to assess the variable importance. We show how visual methods can be used as a complement to cross-validation methods in order to find good SVM input parameters for a particular data set.
UR - http://www.scopus.com/inward/record.url?scp=50149121547&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=50149121547&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-71080-6_10
DO - 10.1007/978-3-540-71080-6_10
M3 - Conference contribution
AN - SCOPUS:50149121547
SN - 3540710795
SN - 9783540710790
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 136
EP - 153
BT - Visual Data Mining - Theory, Techniques and Tools for Visual Analytics
A2 - Simoff, Simeon J.
A2 - Bohlen, Michael H.
A2 - Mazeika, Arturas
ER -