TY - GEN
T1 - Communication-Efficient Federated Learning for Heterogeneous Edge Devices Based on Adaptive Gradient Quantization
AU - Liu, Heting
AU - He, Fang
AU - Cao, Guohong
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Federated learning (FL) enables geographically dispersed edge devices (i.e., clients) to learn a global model without sharing their local datasets, where each client performs gradient descent with its local data and uploads the gradients to a central server to update the global model. However, FL incurs massive communication overhead resulting from uploading the gradients in each training round. To address this problem, most existing research compresses the gradients with a fixed, uniform quantization level for all clients, which neither adapts the quantization to the varying gradient norms across training rounds, nor exploits the heterogeneity of the clients to accelerate FL. In this paper, we propose a novel adaptive and heterogeneous gradient quantization algorithm (AdaGQ) for FL to minimize the wall-clock training time from two aspects: i) adaptive quantization, which exploits the change of the gradient norm to adjust the quantization resolution in each training round; and ii) heterogeneous quantization, which assigns lower quantization resolution to slow clients to align their training time with that of other clients and mitigate the communication bottleneck, and higher quantization resolution to fast clients to achieve a better tradeoff between communication efficiency and accuracy. Evaluations based on various models and datasets validate the benefits of AdaGQ, reducing the total training time by up to 52.1% compared to baseline algorithms (e.g., FedAvg, QSGD).
AB - Federated learning (FL) enables geographically dispersed edge devices (i.e., clients) to learn a global model without sharing their local datasets, where each client performs gradient descent with its local data and uploads the gradients to a central server to update the global model. However, FL incurs massive communication overhead resulting from uploading the gradients in each training round. To address this problem, most existing research compresses the gradients with a fixed, uniform quantization level for all clients, which neither adapts the quantization to the varying gradient norms across training rounds, nor exploits the heterogeneity of the clients to accelerate FL. In this paper, we propose a novel adaptive and heterogeneous gradient quantization algorithm (AdaGQ) for FL to minimize the wall-clock training time from two aspects: i) adaptive quantization, which exploits the change of the gradient norm to adjust the quantization resolution in each training round; and ii) heterogeneous quantization, which assigns lower quantization resolution to slow clients to align their training time with that of other clients and mitigate the communication bottleneck, and higher quantization resolution to fast clients to achieve a better tradeoff between communication efficiency and accuracy. Evaluations based on various models and datasets validate the benefits of AdaGQ, reducing the total training time by up to 52.1% compared to baseline algorithms (e.g., FedAvg, QSGD).
UR - http://www.scopus.com/inward/record.url?scp=85163397550&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85163397550&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM53939.2023.10228970
DO - 10.1109/INFOCOM53939.2023.10228970
M3 - Conference contribution
AN - SCOPUS:85163397550
T3 - Proceedings - IEEE INFOCOM
BT - INFOCOM 2023 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 42nd IEEE International Conference on Computer Communications, INFOCOM 2023
Y2 - 17 May 2023 through 20 May 2023
ER -