Abstract
Conditional probability tables (CPTs) in Bayesian networks often contain missing values. This common problem arises when scenarios that occur in the real world are absent from the training data. Current approaches to the problem are restrictive because they assume specific probability distributions when estimating the missing values. Recently, maximum entropy (ME) approaches have been used to learn features of probability distribution functions from observed data. ME approaches require no distributional assumptions and have been shown to work well for several non-parametric distributions. Both ME and least-squares (LS) error-minimizing approaches can be used to estimate missing CPT values in Bayesian networks. Applying them, however, requires solving difficult constrained non-linear optimization problems, which can be solved using genetic algorithms.
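To make the ME idea concrete, the following is a minimal sketch rather than the method of the paper. It assumes the simplest possible setting: a single CPT row in which the only constraint is that the probabilities sum to 1. Under that assumption, entropy is maximized by spreading the unassigned probability mass evenly over the missing entries, so no numerical optimizer is needed. The function name `max_entropy_fill` and the use of `None` to mark a missing entry are hypothetical choices made for illustration.

```python
def max_entropy_fill(row):
    """Fill missing entries (None) in one CPT row.

    Assumption: the only constraint is that the row sums to 1.
    In that case the maximum entropy solution assigns the same
    probability to every missing entry.
    """
    known = [p for p in row if p is not None]
    n_missing = len(row) - len(known)
    remaining = 1.0 - sum(known)

    # Known entries must not already exceed a total probability of 1
    # (a small tolerance absorbs floating-point rounding).
    if remaining < -1e-12:
        raise ValueError("known probabilities sum to more than 1")
    if n_missing == 0:
        return list(row)

    # Spread the leftover mass evenly over the missing entries.
    share = max(remaining, 0.0) / n_missing
    return [share if p is None else p for p in row]


if __name__ == "__main__":
    # Two of four entries are missing; they share the leftover mass equally.
    print(max_entropy_fill([0.5, None, None, 0.1]))
```

With additional constraints that couple several rows or tables, a closed form of this kind is generally not available. That is the setting in which the abstract describes constrained non-linear optimization problems solved using genetic algorithms.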
Original language | English (US) |
---|---|
Pages (from-to) | 3583-3602 |
Number of pages | 20 |
Journal | Computational Statistics and Data Analysis |
Volume | 52 |
Issue number | 7 |
State | Published - Mar 15 2008 |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Computational Mathematics
- Computational Theory and Mathematics
- Applied Mathematics