TY - JOUR
T1 - Robust Accelerated Primal-Dual Methods for Computing Saddle Points
AU - Zhang, Xuan
AU - Aybat, Necdet Serhat
AU - Gürbüzbalaban, Mert
N1 - Publisher Copyright:
© 2024 Society for Industrial and Applied Mathematics.
PY - 2024
Y1 - 2024
AB - We consider strongly-convex-strongly-concave saddle point problems, assuming access to unbiased stochastic estimates of the gradients. We propose a stochastic accelerated primal-dual (SAPD) algorithm and show that the SAPD iterate sequence, generated with constant primal-dual step sizes, converges linearly to a neighborhood of the unique saddle point. Interpreting the size of this neighborhood as a measure of robustness to gradient noise, we obtain explicit characterizations of robustness in terms of the SAPD parameters and problem constants. Based on these characterizations, we develop computationally tractable techniques for optimizing the SAPD parameters, i.e., the primal and dual step sizes and the momentum parameter, to achieve a desired trade-off between the convergence rate and robustness on the Pareto curve. This allows SAPD, as an accelerated method, to enjoy fast convergence while remaining robust to noise. SAPD admits convergence guarantees in the distance metric with a variance term that is optimal up to a logarithmic factor, and this factor can be removed by employing a restarting strategy. We also discuss how the convergence and robustness results extend to the merely-convex-merely-concave setting. Finally, we illustrate our framework on a distributionally robust logistic regression problem.
UR - http://www.scopus.com/inward/record.url?scp=85195236282&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85195236282&partnerID=8YFLogxK
DO - 10.1137/21M1462775
M3 - Article
AN - SCOPUS:85195236282
SN - 1052-6234
VL - 34
SP - 1097
EP - 1130
JO - SIAM Journal on Optimization
JF - SIAM Journal on Optimization
IS - 1
ER -