Approximate maximum entropy joint feature inference for discrete space classification

David J. Miller, Lian Yan

Research output: Contribution to conference › Paper › peer-review

1 Scopus citations


We propose a new method for learning discrete space statistical classifiers. Similar to [4] and [2], we cast classification/inference within the more general framework of estimating the joint pmf for the (feature vector, class label) pair. Cheeseman's proposal to construct the maximum entropy (ME) joint pmf consistent with general lower-order probability constraints is in principle powerful, allowing for general dependencies between features. However, this approach has been severely limited by its huge learning complexity. Alternatives such as Bayesian networks (BNs) require explicit specification of conditional independencies, which may be difficult to assess given finite data. Moreover, BN learning approaches are typically greedy and must also address a difficult model-order selection problem to avoid overfitting. Here we reconsider the ME problem, proposing an approximate method which encodes arbitrary low-order constraints while retaining quite tractable learning. The new method approximates the joint feature pmf during learning on a subgrid of the full feature space. Results on the UC Irvine repository reveal performance gains over [4], over the BN Kutato, and also over MLPs. Extensions to more general inference problems are indicated.
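To make the ME-with-low-order-constraints idea concrete, the sketch below fits a maximum entropy joint pmf over two binary features and a binary class label, constrained to match the empirical pairwise (feature, class) marginals, via iterative proportional fitting, then classifies by the resulting posterior. This is a generic illustration under assumed toy data, not the paper's subgrid approximation; the variable names and the 200-sample synthetic dataset are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical training data: each row is (x1, x2, c), all binary.
data = rng.integers(0, 2, size=(200, 3))

def empirical_marginal(i, j):
    """Empirical joint pmf of columns i and j of the data."""
    counts = np.zeros((2, 2))
    for row in data:
        counts[row[i], row[j]] += 1
    return counts / counts.sum()

# Pairwise (feature, class) constraints to enforce.
target_x1c = empirical_marginal(0, 2)
target_x2c = empirical_marginal(1, 2)

# Start from the uniform pmf (the unconstrained ME solution) and
# rescale marginals in turn (iterative proportional fitting).
joint = np.full((2, 2, 2), 1.0 / 8)
for _ in range(100):
    # Match the (x1, c) marginal: sum out x2 (axis 1).
    cur = joint.sum(axis=1)
    joint *= (target_x1c / np.maximum(cur, 1e-12))[:, None, :]
    # Match the (x2, c) marginal: sum out x1 (axis 0).
    cur = joint.sum(axis=0)
    joint *= (target_x2c / np.maximum(cur, 1e-12))[None, :, :]

def classify(x1, x2):
    """Classify by argmax_c P(c | x1, x2) under the fitted ME joint pmf."""
    return int(np.argmax(joint[x1, x2, :]))
```

The fitted `joint` remains a proper pmf, and its pairwise (feature, class) marginals converge to the empirical ones; classification then needs only a slice and an argmax. The paper's contribution addresses the case where the full feature grid is too large to store, which this toy sketch sidesteps.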

Original language: English (US)
Number of pages: 6
State: Published - 1999
Event: International Joint Conference on Neural Networks (IJCNN'99) - Washington, DC, USA
Duration: Jul 10 1999 - Jul 16 1999


Other: International Joint Conference on Neural Networks (IJCNN'99)
City: Washington, DC, USA

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence


