Generation of attribute value taxonomies from data for data-driven construction of accurate and compact classifiers

Dae Ki Kang, Adrian Silvescu, Jun Zhang, Vasant Honavar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

18 Scopus citations

Abstract

Attribute Value Taxonomies (AVT) have been shown to be useful in constructing compact, robust, and comprehensible classifiers. However, in many application domains, human-designed AVTs are unavailable. We introduce AVT-Learner, an algorithm for automated construction of attribute value taxonomies from data. AVT-Learner uses Hierarchical Agglomerative Clustering (HAC) to cluster attribute values based on the distribution of classes that co-occur with the values. We describe experiments on UCI data sets that compare the performance of AVT-NBL (an AVT-guided Naive Bayes Learner) with that of the standard Naive Bayes Learner (NBL) applied to the original data set. Our results show that the AVTs generated by AVT-Learner are competitive with human-generated AVTs (in cases where such AVTs are available). AVT-NBL using AVTs generated by AVT-Learner achieves classification accuracies that are comparable to or higher than those obtained by NBL, and the resulting classifiers are significantly more compact than those generated by NBL.
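
The clustering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name a dissimilarity measure, so Jensen-Shannon divergence between class distributions is assumed here, and the function names (`class_counts`, `js_divergence`, `learn_taxonomy`) are hypothetical. Each attribute value starts as its own cluster, and the two clusters whose co-occurring class distributions are closest are merged repeatedly until a single binary tree remains.

```python
import math
from collections import Counter, defaultdict


def class_counts(values, labels):
    """Count how often each class label co-occurs with each attribute value."""
    counts = defaultdict(Counter)
    for v, y in zip(values, labels):
        counts[v][y] += 1
    return counts


def js_divergence(c1, c2):
    """Jensen-Shannon divergence (base 2) between two class-count tables.

    Assumption: the abstract does not specify the measure; JS divergence
    is used here only as one reasonable choice.
    """
    n1, n2 = sum(c1.values()), sum(c2.values())
    total = 0.0
    for y in set(c1) | set(c2):
        p, q = c1[y] / n1, c2[y] / n2
        m = 0.5 * (p + q)
        if p > 0:
            total += 0.5 * p * math.log2(p / m)
        if q > 0:
            total += 0.5 * q * math.log2(q / m)
    return total


def learn_taxonomy(values, labels):
    """Agglomeratively merge attribute values into a binary tree.

    Leaves are attribute values; internal nodes are (left, right) tuples.
    """
    clusters = dict(class_counts(values, labels))  # node -> class counts
    while len(clusters) > 1:
        nodes = sorted(clusters, key=repr)  # fixed order for reproducible ties
        # Pick the pair of clusters whose class distributions are closest.
        a, b = min(
            ((x, y) for i, x in enumerate(nodes) for y in nodes[i + 1:]),
            key=lambda pair: js_divergence(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters.pop(a) + clusters.pop(b)  # pool the class counts
        clusters[(a, b)] = merged
    return next(iter(clusters))


if __name__ == "__main__":
    # Toy data (hypothetical): a color attribute and a two-class label.
    colors = ["red", "red", "crimson", "crimson", "blue", "blue", "navy", "navy"]
    labels = ["warm", "warm", "warm", "warm", "cool", "cool", "cool", "cool"]
    print(learn_taxonomy(colors, labels))
```

On the toy data, values that share the same class profile end up under a common internal node, which is the kind of grouping a taxonomy-guided learner could then exploit. The pairwise search is quadratic per merge, which is acceptable for attributes with few distinct values but would need a smarter scheme for large value sets.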

Original language: English (US)
Title of host publication: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004
Editors: R. Rastogi, K. Morik, M. Bramer, X. Wu
Pages: 130-137
Number of pages: 8
State: Published - 2004
Event: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004 - Brighton, United Kingdom
Duration: Nov 1 2004 to Nov 4 2004

Other

Other: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004
Country/Territory: United Kingdom
City: Brighton
Period: 11/1/04 to 11/4/04

All Science Journal Classification (ASJC) codes

  • General Engineering
