TY - JOUR
T1 - Nonlinear sufficient dimension reduction for distribution-on-distribution regression
AU - Zhang, Qi
AU - Li, Bing
AU - Xue, Lingzhou
N1 - Publisher Copyright: © 2024 Elsevier Inc.
PY - 2024/7
Y1 - 2024/7
N2 - We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step is to build universal kernels (cc-universal) on the metric spaces, which results in reproducing kernel Hilbert spaces for the predictor and response that are rich enough to characterize the conditional independence that determines sufficient dimension reduction. For univariate distributions, we construct the universal kernel using the Wasserstein distance, while for multivariate distributions, we resort to the sliced Wasserstein distance. The sliced Wasserstein distance ensures that the metric space possesses similar topological properties to the Wasserstein space, while also offering significant computation benefits. Numerical results based on synthetic data show that our method outperforms possible competing methods. The method is also applied to several data sets, including fertility and mortality data and Calgary temperature data.
AB - We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step is to build universal kernels (cc-universal) on the metric spaces, which results in reproducing kernel Hilbert spaces for the predictor and response that are rich enough to characterize the conditional independence that determines sufficient dimension reduction. For univariate distributions, we construct the universal kernel using the Wasserstein distance, while for multivariate distributions, we resort to the sliced Wasserstein distance. The sliced Wasserstein distance ensures that the metric space possesses similar topological properties to the Wasserstein space, while also offering significant computation benefits. Numerical results based on synthetic data show that our method outperforms possible competing methods. The method is also applied to several data sets, including fertility and mortality data and Calgary temperature data.
UR - http://www.scopus.com/inward/record.url?scp=85186221334&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85186221334&partnerID=8YFLogxK
U2 - 10.1016/j.jmva.2024.105302
DO - 10.1016/j.jmva.2024.105302
M3 - Article
C2 - 38525479
AN - SCOPUS:85186221334
SN - 0047-259X
VL - 202
JO - Journal of Multivariate Analysis
JF - Journal of Multivariate Analysis
M1 - 105302
ER -