TY - JOUR
T1 - GROMOV–WASSERSTEIN DISTANCES
T2 - ENTROPIC REGULARIZATION, DUALITY AND SAMPLE COMPLEXITY
AU - Zhang, Zhengxin
AU - Goldfeld, Ziv
AU - Mroueh, Youssef
AU - Sriperumbudur, Bharath K.
N1 - Publisher Copyright:
© 2024 Institute of Mathematical Statistics. All rights reserved.
PY - 2024/8
Y1 - 2024/8
N2 - The Gromov–Wasserstein (GW) distance, rooted in optimal transport (OT) theory, quantifies dissimilarity between metric measure spaces and provides a framework for aligning heterogeneous datasets. While computational aspects of the GW problem have been widely studied, a duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the quadratic GW distance over Euclidean spaces of different dimensions dx and dy. We treat both the standard and the entropically regularized GW distance, and derive dual forms that represent them in terms of the well-understood OT and entropic OT (EOT) problems, respectively. This enables employing proof techniques from statistical OT based on regularity analysis of dual potentials and empirical process theory, using which we establish the first GW empirical convergence rates. The derived two-sample rates are n−2/max{min{dx,dy},4} (up to a log factor when min{dx,dy} = 4) for standard GW and n−1/2 for entropic GW (EGW), which matches the corresponding rates for standard and entropic OT. The parametric rate for EGW is evidently optimal, while for standard GW we provide matching lower bounds, which establish sharpness of the derived rates. We also study stability of EGW in the entropic regularization parameter and prove approximation and continuity results for the cost and optimal couplings. Lastly, the duality is leveraged to shed new light on the open problem of the one-dimensional GW distance between uniform distributions on n points, illuminating why the identity and anti-identity permutations may not be optimal. Our results serve as a first step towards a comprehensive statistical theory as well as computational advancements for GW distances, based on the discovered dual formulations.
AB - The Gromov–Wasserstein (GW) distance, rooted in optimal transport (OT) theory, quantifies dissimilarity between metric measure spaces and provides a framework for aligning heterogeneous datasets. While computational aspects of the GW problem have been widely studied, a duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the quadratic GW distance over Euclidean spaces of different dimensions dx and dy. We treat both the standard and the entropically regularized GW distance, and derive dual forms that represent them in terms of the well-understood OT and entropic OT (EOT) problems, respectively. This enables employing proof techniques from statistical OT based on regularity analysis of dual potentials and empirical process theory, using which we establish the first GW empirical convergence rates. The derived two-sample rates are n−2/max{min{dx,dy},4} (up to a log factor when min{dx,dy} = 4) for standard GW and n−1/2 for entropic GW (EGW), which matches the corresponding rates for standard and entropic OT. The parametric rate for EGW is evidently optimal, while for standard GW we provide matching lower bounds, which establish sharpness of the derived rates. We also study stability of EGW in the entropic regularization parameter and prove approximation and continuity results for the cost and optimal couplings. Lastly, the duality is leveraged to shed new light on the open problem of the one-dimensional GW distance between uniform distributions on n points, illuminating why the identity and anti-identity permutations may not be optimal. Our results serve as a first step towards a comprehensive statistical theory as well as computational advancements for GW distances, based on the discovered dual formulations.
UR - http://www.scopus.com/inward/record.url?scp=85201023666&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85201023666&partnerID=8YFLogxK
U2 - 10.1214/24-AOS2406
DO - 10.1214/24-AOS2406
M3 - Article
AN - SCOPUS:85201023666
SN - 0090-5364
VL - 52
SP - 1616
EP - 1645
JO - Annals of Statistics
JF - Annals of Statistics
IS - 4
ER -