Several authors have addressed learning a classifier given a mixed labeled/unlabeled training set. These works assume each unlabeled sample originates from one of the (known) classes. Here, we consider the scenario in which unlabeled points may belong either to known/predefined or to heretofore undiscovered classes. There are several practical situations where such data may arise. We earlier proposed a novel statistical mixture model to fit this mixed data. Here we review this method and also introduce an alternative model. Our fundamental strategy is to view as observed data not only the feature vector and the class label, but also the fact of label presence/absence for each point. Two types of mixture components are posited to explain label presence/absence. "Predefined" components generate both labeled and unlabeled points and assume labels are missing at random. These components represent the known classes. "Non-predefined" components only generate unlabeled points; thus, in localized regions, they capture data subsets that are exclusively unlabeled. Such subsets may represent an outlier distribution, or new classes. The components' predefined/nonpredefined natures are data-driven, learned along with the other parameters via an algorithm based on expectation-maximization (EM). There are three natural applications: 1) robust classifier design, given a mixed training set with outliers; 2) classification with rejections; 3) identification of the unlabeled points (and their representative components) that originate from unknown classes, i.e., new class discovery. The effectiveness of our models in discovering purely unlabeled data components (potential new classes) is evaluated both on synthetic and real data sets. Although each of our models has its own advantages, our original model is found to achieve the best class discovery results.
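
The role of the two component types can be illustrated with a minimal sketch. It is not the model or the EM algorithm of this paper. It only shows, for a one-dimensional Gaussian mixture, how treating label presence/absence as observed data changes the posterior over components. The names `p_labeled` and `class_probs`, the parameter values, and the choice to fix each component's type by hand (rather than learn it) are all assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def responsibilities(x, label, components):
    """Posterior over components for one point.

    label is a class index, or None when the label is absent.
    Each component carries a hypothetical p_labeled: the probability that
    a point it generates comes with a label. A non-predefined component
    has p_labeled == 0, so it can only explain unlabeled points.
    """
    scores = []
    for comp in components:
        density = comp["weight"] * gaussian_pdf(x, comp["mean"], comp["var"])
        if label is None:
            # Label absent: weight by the probability of a missing label.
            score = density * (1.0 - comp["p_labeled"])
        else:
            # Label present: weight by label presence and the class probability.
            score = density * comp["p_labeled"] * comp["class_probs"][label]
        scores.append(score)
    total = sum(scores)
    if total == 0.0:
        raise ValueError("no component can explain this observation")
    return [s / total for s in scores]

# Illustrative parameters (assumed, not taken from the paper):
# two predefined components for known classes 0 and 1, and one
# non-predefined component that generates only unlabeled points.
components = [
    {"weight": 0.4, "mean": 0.0, "var": 1.0, "p_labeled": 0.5, "class_probs": [1.0, 0.0]},
    {"weight": 0.4, "mean": 4.0, "var": 1.0, "p_labeled": 0.5, "class_probs": [0.0, 1.0]},
    {"weight": 0.2, "mean": 10.0, "var": 1.0, "p_labeled": 0.0, "class_probs": [0.5, 0.5]},
]

labeled_post = responsibilities(0.5, 0, components)       # labeled point, class 0
unlabeled_post = responsibilities(9.5, None, components)  # unlabeled point near 10
```

In this sketch a labeled point receives zero posterior mass from the non-predefined component, while an unlabeled point lying in a region with no labeled data is assigned almost entirely to it. That is the sense in which such a component can flag a candidate outlier distribution or new class.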