A global optimization technique for statistical classifier design

David Miller, Ajit V. Rao, Kenneth Rose, Allen Gersho

Research output: Contribution to journal, Article, peer-review

70 Scopus citations

Abstract

A global optimization method is introduced for the design of statistical classifiers that minimize the rate of misclassification. We first derive the theoretical basis for the method. On that basis we develop a novel design algorithm and demonstrate its effectiveness and superior performance in the design of practical classifiers for some of the most popular structures currently in use. The method, grounded in ideas from statistical physics and information theory, extends the deterministic annealing approach for optimization, both to incorporate structural constraints on data assignments to classes and to minimize the probability of error as the cost objective. During the design, data are assigned to classes in probability so as to minimize the expected classification error given a specified level of randomness, as measured by Shannon's entropy. The constrained optimization is equivalent to a free-energy minimization, motivating a deterministic annealing approach in which the entropy and expected misclassification cost are reduced with the temperature while enforcing the classifier's structure. In the limit, a hard classifier is obtained. This approach is applicable to a variety of classifier structures, including the widely used prototype-based, radial basis function, and multilayer perceptron classifiers. The method is compared with learning vector quantization, back propagation (BP), and several radial basis function design techniques, as well as with paradigms for more directly optimizing all these structures to minimize probability of error. The annealing method achieves significant performance gains over other design methods on a number of benchmark examples from the literature, while often retaining design complexity comparable with, or only moderately greater than, that of strict descent methods. Substantial gains, both inside and outside the training set, are achieved for complicated examples involving high-dimensional data and large class overlap.
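The entropy-constrained assignment described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it shows only the generic, unconstrained case, in which minimizing expected cost minus temperature times Shannon entropy yields a Gibbs distribution over classes, and it omits the structural constraints (prototype, radial basis function, or perceptron form) that the method enforces. The function names, the per-class costs, and the temperature schedule are all hypothetical, chosen for illustration.

```python
import math

def gibbs_assignment(costs, temperature):
    """Class probabilities p_j proportional to exp(-cost_j / T).

    This is the distribution that minimizes expected cost minus
    T times Shannon entropy when no structural constraint is imposed.
    """
    lowest = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def free_energy(costs, probs, temperature):
    """Expected cost minus temperature times entropy."""
    expected_cost = sum(p * c for p, c in zip(probs, costs))
    return expected_cost - temperature * entropy(probs)

# Hypothetical per-class misclassification costs for one training sample.
example_costs = [0.2, 0.5, 0.9]

# Hypothetical cooling schedule: high temperature first, then lower.
schedule = [10.0, 1.0, 0.1, 0.01]

for temperature in schedule:
    probs = gibbs_assignment(example_costs, temperature)
    print(temperature, [round(p, 4) for p in probs], round(entropy(probs), 4))
```

At high temperature the assignment is close to uniform. As the temperature is lowered, the entropy shrinks and the probability mass concentrates on the lowest-cost class, which mirrors the hard classifier obtained in the limit.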

Original language: English (US)
Pages (from-to): 3108-3122
Number of pages: 15
Journal: IEEE Transactions on Signal Processing
Volume: 44
Issue number: 12
State: Published - 1996

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering
