On classification with incomplete data

David Williams, Xuejun Liao, Ya Xue, Lawrence Carin, Balaji Krishnapuram

    Research output: Contribution to journal › Article › peer-review

    98 Scopus citations

    Abstract

    We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both Expectation-Maximization (EM) and Variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data - both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.
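
The conditioning step mentioned in the abstract can be illustrated with a minimal sketch. Under a GMM, the density of the missing features given the observed ones is again a Gaussian mixture with closed-form weights, means, and covariances. The code below shows only that standard identity. It assumes the GMM parameters are already fitted (by EM or VB-EM, for instance), that missing entries are marked with NaN, and that at least one feature is observed. The function names are hypothetical and are not taken from the paper, and the subsequent integration of the logistic function is not shown because the abstract does not specify it.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal, computed via a Cholesky factor."""
    d = x.shape[0]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, x - mean)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)


def conditional_gmm(x, weights, means, covs):
    """Return the mixture p(x_missing | x_observed) implied by a fitted GMM.

    x       : 1-D array, np.nan marks missing features (assumed convention)
    weights : (K,) mixing proportions
    means   : (K, D) component means
    covs    : (K, D, D) component covariances
    Assumes at least one feature of x is observed.
    """
    m = np.isnan(x)   # mask of missing features
    o = ~m            # mask of observed features
    x_o = x[o]

    log_w, cond_means, cond_covs = [], [], []
    for pi_k, mu, sigma in zip(weights, means, covs):
        s_oo = sigma[np.ix_(o, o)]
        s_mo = sigma[np.ix_(m, o)]
        s_mm = sigma[np.ix_(m, m)]
        # gain = S_mo S_oo^{-1}, using the symmetry of S_oo
        gain = np.linalg.solve(s_oo, s_mo.T).T
        cond_means.append(mu[m] + gain @ (x_o - mu[o]))
        cond_covs.append(s_mm - gain @ s_mo.T)
        # Unnormalized log-responsibility of component k given x_o
        log_w.append(np.log(pi_k) + gaussian_logpdf(x_o, mu[o], s_oo))

    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
    w /= w.sum()
    return w, np.array(cond_means), np.array(cond_covs)


# Toy usage: one component, second feature missing.
x = np.array([1.0, np.nan])
weights = np.array([1.0])
means = np.array([[0.0, 0.0]])
covs = np.array([[[1.0, 0.5], [0.5, 2.0]]])
w, cm, cc = conditional_gmm(x, weights, means, covs)
```

In the toy case the conditional mean of the missing feature is 0.5 and its conditional variance is 1.75, which matches the usual Gaussian conditioning formulas.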

    Original language: English (US)
    Pages (from-to): 427-436
    Number of pages: 10
    Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
    Volume: 29
    Issue number: 3
    State: Published - Mar 2007

    All Science Journal Classification (ASJC) codes

    • Software
    • Computer Vision and Pattern Recognition
    • Computational Theory and Mathematics
    • Artificial Intelligence
    • Applied Mathematics
