TY - GEN
T1 - Semisupervised mixture modeling with fine-grained component-conditional class labeling and transductive inference
AU - Miller, David Jonathan
AU - Lin, Chu Fang
AU - Kesidis, George
AU - Collins, Christopher M.
PY - 2009
Y1 - 2009
AB - This paper introduces a new generative semisupervised (transductive) mixture model with a more fine-grained class label generation mechanism than that of previous works. Our approach effectively combines the advantages of standard semisupervised mixtures, which achieve label extrapolation over a mixture component when there are few labeled samples, and nearest-neighbor (NN) classification, which achieves accurate classification in the local vicinity of labeled samples. Toward this end, we propose a two-stage stochastic data generation mechanism, with the unlabeled samples first produced and then the labeled samples generated conditioned on both the unlabeled data and on their components of origin. This nested data generation entails a more complicated (albeit still closed-form) E-step evaluation than that for standard mixtures. Our model is advantageous, compared with previous semisupervised mixtures, when mixture components model data from more than one class and when within-component class proportions are not constant over the feature space region "owned" by a component. Experiments demonstrate gains in classification accuracy over both the previous semisupervised mixture of experts model and K-NN classification on data sets from the UC Irvine Repository.
UR - http://www.scopus.com/inward/record.url?scp=77950961037&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77950961037&partnerID=8YFLogxK
DO - 10.1109/MLSP.2009.5306229
M3 - Conference contribution
AN - SCOPUS:77950961037
SN - 9781424449484
T3 - Machine Learning for Signal Processing XIX - Proceedings of the 2009 IEEE Signal Processing Society Workshop, MLSP 2009
BT - Machine Learning for Signal Processing XIX - Proceedings of the 2009 IEEE Signal Processing Society Workshop, MLSP 2009
T2 - Machine Learning for Signal Processing XIX - 2009 IEEE Signal Processing Society Workshop, MLSP 2009
Y2 - 2 September 2009 through 4 September 2009
ER -