TY - GEN
T1 - A mixture of experts classifier with learning based on both labelled and unlabelled data
AU - Miller, David J.
AU - Uyar, Hasan S.
PY - 1997
Y1 - 1997
AB - We address statistical classifier design given a mixed training set consisting of a small labelled feature set and a (generally larger) set of unlabelled features. This situation arises, e.g., for medical images, where although training features may be plentiful, expensive expertise is required to extract their class labels. We propose a classifier structure and learning algorithm that make effective use of unlabelled data to improve performance. The learning is based on maximization of the total data likelihood, i.e. over both the labelled and unlabelled data subsets. Two distinct EM learning algorithms are proposed, differing in the EM formalism applied for unlabelled data. The classifier, based on a joint probability model for features and labels, is a "mixture of experts" structure that is equivalent to the radial basis function (RBF) classifier, but unlike RBFs, is amenable to likelihood-based training. The scope of application for the new method is greatly extended by the observation that test data, or any new data to classify, is in fact additional, unlabelled data; thus, a combined learning/classification operation, much akin to what is done in image segmentation, can be invoked whenever there is new data to classify. Experiments with data sets from the UC Irvine database demonstrate that the new learning algorithms and structure achieve substantial performance gains over alternative approaches.
UR - http://www.scopus.com/inward/record.url?scp=84898980291&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898980291&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84898980291
SN - 0262100657
SN - 9780262100656
T3 - Advances in Neural Information Processing Systems
SP - 571
EP - 577
BT - Advances in Neural Information Processing Systems 9 - Proceedings of the 1996 Conference, NIPS 1996
PB - Neural Information Processing Systems Foundation
T2 - 10th Annual Conference on Neural Information Processing Systems, NIPS 1996
Y2 - 2 December 1996 through 5 December 1996
ER -