Distributed (ATC) Gradient Descent for High Dimension Sparse Regression

Yao Ji, Gesualdo Scutari, Ying Sun, Harsha Honnappa

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

We study linear regression from data distributed over a network of agents (with no server node) by means of LASSO estimation, in high-dimension, which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the problem, statistical and computational guarantees of most of them remain unclear in high dimension. This paper provides a first statistical study of the Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions-which hold with high probability for standard data generation models-suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst-case scenario, the total number of communications to statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, scaling linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high-dimensions to obtain favorable rate scalings.

Original languageEnglish (US)
Pages (from-to)5253-5276
Number of pages24
JournalIEEE Transactions on Information Theory
Volume69
Issue number8
DOIs
StatePublished - Aug 1 2023

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

Cite this