TY - JOUR
T1 - Distributed Optimization Based on Gradient Tracking Revisited
T2 - Enhancing Convergence Rate via Surrogation
AU - Sun, Ying
AU - Scutari, Gesualdo
AU - Daneshmand, Amir
N1 - Publisher Copyright:
© 2022 Society for Industrial and Applied Mathematics
PY - 2022
Y1 - 2022
N2 - We study distributed multiagent optimization over graphs. We consider the minimization of $F + G$ subject to convex constraints, where $F$ is the smooth, strongly convex sum of the agents' losses and $G$ is a nonsmooth convex function. We build on the SONATA algorithm: the algorithm employs surrogate objective functions in the agents' subproblems (thus going beyond linearization, as in proximal-gradient methods) coupled with a perturbed consensus mechanism that aims to locally track the gradient of $F$. SONATA achieves precision $\epsilon > 0$ on the objective value in $\mathcal{O}(\kappa_g \log(1/\epsilon))$ gradient computations at each node and $\widetilde{\mathcal{O}}(\kappa_g (1 - \rho)^{-1/2} \log(1/\epsilon))$ communication steps, where $\kappa_g$ is the condition number of $F$ and $\rho$ characterizes the connectivity of the network. This is the first linear-rate result for distributed composite optimization; it also improves on existing (nonaccelerated) schemes that minimize $F$ alone, whose rates depend on much larger quantities than $\kappa_g$. When the agents' loss functions are similar, due to statistical data similarity or otherwise, SONATA employing high-order surrogates achieves precision $\epsilon > 0$ in $\mathcal{O}((\beta/\mu) \log(1/\epsilon))$ iterations and $\widetilde{\mathcal{O}}((\beta/\mu)(1 - \rho)^{-1/2} \log(1/\epsilon))$ communication steps, where $\beta$ measures the degree of similarity of the agents' losses and $\mu$ is the strong convexity constant of $F$. Therefore, when $\beta/\mu < \kappa_g$, the use of high-order surrogates yields provably faster rates than those achievable with first-order models, without exchanging any Hessian matrices over the network.
AB - We study distributed multiagent optimization over graphs. We consider the minimization of $F + G$ subject to convex constraints, where $F$ is the smooth, strongly convex sum of the agents' losses and $G$ is a nonsmooth convex function. We build on the SONATA algorithm: the algorithm employs surrogate objective functions in the agents' subproblems (thus going beyond linearization, as in proximal-gradient methods) coupled with a perturbed consensus mechanism that aims to locally track the gradient of $F$. SONATA achieves precision $\epsilon > 0$ on the objective value in $\mathcal{O}(\kappa_g \log(1/\epsilon))$ gradient computations at each node and $\widetilde{\mathcal{O}}(\kappa_g (1 - \rho)^{-1/2} \log(1/\epsilon))$ communication steps, where $\kappa_g$ is the condition number of $F$ and $\rho$ characterizes the connectivity of the network. This is the first linear-rate result for distributed composite optimization; it also improves on existing (nonaccelerated) schemes that minimize $F$ alone, whose rates depend on much larger quantities than $\kappa_g$. When the agents' loss functions are similar, due to statistical data similarity or otherwise, SONATA employing high-order surrogates achieves precision $\epsilon > 0$ in $\mathcal{O}((\beta/\mu) \log(1/\epsilon))$ iterations and $\widetilde{\mathcal{O}}((\beta/\mu)(1 - \rho)^{-1/2} \log(1/\epsilon))$ communication steps, where $\beta$ measures the degree of similarity of the agents' losses and $\mu$ is the strong convexity constant of $F$. Therefore, when $\beta/\mu < \kappa_g$, the use of high-order surrogates yields provably faster rates than those achievable with first-order models, without exchanging any Hessian matrices over the network.
UR - http://www.scopus.com/inward/record.url?scp=85130634943&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130634943&partnerID=8YFLogxK
U2 - 10.1137/19M1259973
DO - 10.1137/19M1259973
M3 - Article
AN - SCOPUS:85130634943
SN - 1052-6234
VL - 32
SP - 354
EP - 385
JO - SIAM Journal on Optimization
JF - SIAM Journal on Optimization
IS - 2
ER -