TY - GEN
T1 - Efficient Markov network structure discovery using independence tests
AU - Bromberg, Facundo
AU - Margaritis, Dimitris
AU - Honavar, Vasant
PY - 2006
Y1 - 2006
AB - We present two algorithms for learning the structure of a Markov network from discrete data: GSMN and GSIMN. Both algorithms use statistical conditional independence tests on data to infer the structure by successively constraining the set of structures consistent with the results of these tests. GSMN is a natural adaptation of the Grow-Shrink algorithm of Margaritis and Thrun for learning the structure of Bayesian networks. GSIMN extends GSMN by additionally exploiting Pearl's well-known properties of conditional independence relations to infer novel independencies from known independencies, thus avoiding the need to perform these tests. Experiments on artificial and real data sets show GSIMN can yield savings of up to 70% with respect to GSMN, while generating a Markov network with comparable or in several cases considerably improved quality. In addition to GSMN, we also compare GSIMN to a forward-chaining implementation, called GSIMN-FCH, that produces all possible conditional independence results by repeatedly applying Pearl's theorems on the known conditional independence tests. The results of this comparison show that GSIMN is nearly optimal in terms of the number of tests it can infer, under a fixed ordering of the tests performed.
UR - http://www.scopus.com/inward/record.url?scp=33745441891&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745441891&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972764.13
DO - 10.1137/1.9781611972764.13
M3 - Conference contribution
AN - SCOPUS:33745441891
SN - 089871611X
SN - 9780898716115
T3 - Proceedings of the Sixth SIAM International Conference on Data Mining
SP - 141
EP - 152
BT - Proceedings of the Sixth SIAM International Conference on Data Mining
PB - Society for Industrial and Applied Mathematics
T2 - Sixth SIAM International Conference on Data Mining
Y2 - 20 April 2006 through 22 April 2006
ER -