TY - JOUR
T1 - General statistical inference for discrete and mixed spaces by an approximate application of the maximum entropy principle
AU - Yan, Lian
AU - Miller, David J.
N1 - Funding Information:
Manuscript received June 1, 1999; revised January 13, 2000. This work was supported in part by the National Science Foundation under CAREER Award IIS-9624870.
PY - 2000/5
Y1 - 2000/5
N2 - We propose a new method for learning a general statistical inference engine, operating on discrete and mixed discrete/continuous feature spaces. Such a model allows inference on any of the discrete features, given values for the remaining features. Applications include, e.g., medical diagnosis with multiple possible diseases, fault diagnosis, information retrieval, and imputation in databases. Bayesian networks (BNs) are versatile tools that possess this inference capability. However, BNs require explicit specification of conditional independences, which may be difficult to assess given limited data. Alternatively, Cheeseman proposed finding the maximum entropy (ME) joint probability mass function (pmf) consistent with arbitrary lower-order probability constraints. This approach is in principle powerful and does not require explicit expression of conditional independence. However, until now, the huge learning complexity has severely limited the use of this approach. Here we propose an approximate ME method, which also encodes arbitrary low-order constraints while retaining quite tractable learning. Our method uses a restriction of joint pmf support (during learning) to a subset of the feature space. Results on the University of California-Irvine repository reveal performance gains over several BN approaches and over multilayer perceptrons.
AB - We propose a new method for learning a general statistical inference engine, operating on discrete and mixed discrete/continuous feature spaces. Such a model allows inference on any of the discrete features, given values for the remaining features. Applications include, e.g., medical diagnosis with multiple possible diseases, fault diagnosis, information retrieval, and imputation in databases. Bayesian networks (BNs) are versatile tools that possess this inference capability. However, BNs require explicit specification of conditional independences, which may be difficult to assess given limited data. Alternatively, Cheeseman proposed finding the maximum entropy (ME) joint probability mass function (pmf) consistent with arbitrary lower-order probability constraints. This approach is in principle powerful and does not require explicit expression of conditional independence. However, until now, the huge learning complexity has severely limited the use of this approach. Here we propose an approximate ME method, which also encodes arbitrary low-order constraints while retaining quite tractable learning. Our method uses a restriction of joint pmf support (during learning) to a subset of the feature space. Results on the University of California-Irvine repository reveal performance gains over several BN approaches and over multilayer perceptrons.
UR - http://www.scopus.com/inward/record.url?scp=0034187697&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034187697&partnerID=8YFLogxK
U2 - 10.1109/72.846727
DO - 10.1109/72.846727
M3 - Article
C2 - 18249785
AN - SCOPUS:0034187697
SN - 1045-9227
VL - 11
SP - 558
EP - 573
JO - IEEE Transactions on Neural Networks
JF - IEEE Transactions on Neural Networks
IS - 3
ER -