Abstract
Adaptive federated learning, which benefits from the characteristic of both adaptive op-timizer and federated training paradigm, has recently gained lots of attention. Despite achieving outstanding performances on tasks with heavy-tail stochastic gradient noise dis-tributions, adaptive federated learning also suffers from the same data heterogeneity issue as standard federated learning: heterogeneous data distribution across the clients can largely deteriorate the convergence of adaptive federated learning. In this paper, we propose a novel adaptive federated learning framework with local gossip averaging to address this issue. Particularly, we introduce a client re-sampling mechanism and peer-to-peer gossip communications between local clients to mitigate the data heterogeneity without requiring additional gradient computation costs. We theoretically prove the fast convergence for our proposed method under non-convex stochastic settings and empirically demonstrate its su-perior performances over vanilla adaptive federated learning with client sampling. Moreover, we extend our framework to a communication-efficient variant, in which clients are divided into disjoint clusters determined by their connectivity or communication capabilities. We exclusively perform local gossip averaging within these clusters, leading to an enhancement in network communication efficiency for our proposed method.
| Original language | English (US) |
|---|---|
| Journal | Transactions on Machine Learning Research |
| Volume | 2024 |
| State | Published - 2024 |
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition
- Artificial Intelligence
Fingerprint
Dive into the research topics of 'On the Data Heterogeneity in Adaptive Federated Learning'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver