TY - JOUR
T1 - On the Data Heterogeneity in Adaptive Federated Learning
AU - Wang, Yujia
AU - Chen, Jinghui
N1 - Publisher Copyright:
© 2024, Transactions on Machine Learning Research. All rights reserved.
PY - 2024
Y1 - 2024
N2 - Adaptive federated learning, which benefits from the characteristics of both adaptive optimizers and the federated training paradigm, has recently gained much attention. Despite achieving outstanding performance on tasks with heavy-tailed stochastic gradient noise distributions, adaptive federated learning suffers from the same data heterogeneity issue as standard federated learning: heterogeneous data distributions across clients can largely deteriorate the convergence of adaptive federated learning. In this paper, we propose a novel adaptive federated learning framework with local gossip averaging to address this issue. In particular, we introduce a client re-sampling mechanism and peer-to-peer gossip communications between local clients to mitigate data heterogeneity without requiring additional gradient computation costs. We theoretically prove fast convergence for our proposed method under non-convex stochastic settings and empirically demonstrate its superior performance over vanilla adaptive federated learning with client sampling. Moreover, we extend our framework to a communication-efficient variant in which clients are divided into disjoint clusters determined by their connectivity or communication capabilities. We perform local gossip averaging exclusively within these clusters, improving the network communication efficiency of our proposed method.
AB - Adaptive federated learning, which benefits from the characteristics of both adaptive optimizers and the federated training paradigm, has recently gained much attention. Despite achieving outstanding performance on tasks with heavy-tailed stochastic gradient noise distributions, adaptive federated learning suffers from the same data heterogeneity issue as standard federated learning: heterogeneous data distributions across clients can largely deteriorate the convergence of adaptive federated learning. In this paper, we propose a novel adaptive federated learning framework with local gossip averaging to address this issue. In particular, we introduce a client re-sampling mechanism and peer-to-peer gossip communications between local clients to mitigate data heterogeneity without requiring additional gradient computation costs. We theoretically prove fast convergence for our proposed method under non-convex stochastic settings and empirically demonstrate its superior performance over vanilla adaptive federated learning with client sampling. Moreover, we extend our framework to a communication-efficient variant in which clients are divided into disjoint clusters determined by their connectivity or communication capabilities. We perform local gossip averaging exclusively within these clusters, improving the network communication efficiency of our proposed method.
UR - http://www.scopus.com/inward/record.url?scp=85219582885&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85219582885&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85219582885
SN - 2835-8856
VL - 2024
JO - Transactions on Machine Learning Research
JF - Transactions on Machine Learning Research
ER -