TY - CONF
T1 - Self-discrepancy conditional independence test
AU - Lee, Sanghack
AU - Honavar, Vasant
N1 - Funding Information:
The authors are grateful to the anonymous UAI 2017 reviewers for their thorough reviews. This research was supported by the Edward Frymoyer Endowed Professorship, the Center for Big Data Analytics and Discovery Informatics at the Pennsylvania State University, and the Sudha Murty Distinguished Visiting Chair in Neurocomputing and Data Science at the Indian Institute of Science.
PY - 2017
Y1 - 2017
AB - Tests of conditional independence (CI) of random variables play an important role in machine learning and causal inference. Of particular interest are kernel-based CI tests, which allow us to test for independence among random variables with complex distribution functions. The efficacy of a CI test is measured in terms of its power and its calibratedness. We show that the Kernel CI Permutation Test (KCIPT) suffers from a loss of calibratedness as its power is increased by increasing the number of bootstraps. To address this limitation, we propose a novel CI test, called Self-Discrepancy Conditional Independence Test (SDCIT). SDCIT uses a test statistic that is a modified unbiased estimate of maximum mean discrepancy (MMD), the largest difference in the means of features of the given sample and its permuted counterpart in the kernel-induced Hilbert space. We present results of experiments that demonstrate SDCIT is, relative to the other methods: (i) competitive in terms of its power and calibratedness, outperforming other methods when the number of conditioning variables is large; (ii) more robust with respect to the choice of the kernel function; and (iii) competitive in run time.
UR - http://www.scopus.com/inward/record.url?scp=85031121897&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85031121897&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85031121897
T2 - 33rd Conference on Uncertainty in Artificial Intelligence, UAI 2017
Y2 - 11 August 2017 through 15 August 2017
ER -