TY - GEN
T1 - Cross-Iteration Coded Computing
AU - Haddadpour, Farzin
AU - Yang, Yaoqing
AU - Cadambe, Viveck
AU - Grover, Pulkit
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - We introduce the idea of cross-iteration coded computing, an approach to reducing communication costs for a large class of distributed iterative algorithms involving linear operations, including gradient descent and accelerated gradient descent for quadratic loss functions. The state-of-the-art approach for these iterative algorithms involves performing one iteration of the algorithm per round of communication among the nodes. In contrast, our approach performs multiple iterations of the underlying algorithm in a single round of communication by incorporating some redundancy storage and computation. Our algorithm works in the master-worker setting with the workers storing carefully constructed linear transformations of input matrices and using these matrices in an iterative algorithm, with the master node inverting the effect of these linear transformations. In addition to reduced communication costs, a trivial generalization of our algorithm also includes resilience to stragglers and failures. The degree of redundancy of our algorithm can be tuned based on the amount of communication and straggler resilience required. Finally, we also describe a variant of our algorithm that can flexibly recover the results based on the degree of straggling in the worker nodes. The variant allows for the performance to degrade gracefully as the number of successful (non-straggling) workers is lowered.
AB - We introduce the idea of cross-iteration coded computing, an approach to reducing communication costs for a large class of distributed iterative algorithms involving linear operations, including gradient descent and accelerated gradient descent for quadratic loss functions. The state-of-the-art approach for these iterative algorithms involves performing one iteration of the algorithm per round of communication among the nodes. In contrast, our approach performs multiple iterations of the underlying algorithm in a single round of communication by incorporating some redundancy storage and computation. Our algorithm works in the master-worker setting with the workers storing carefully constructed linear transformations of input matrices and using these matrices in an iterative algorithm, with the master node inverting the effect of these linear transformations. In addition to reduced communication costs, a trivial generalization of our algorithm also includes resilience to stragglers and failures. The degree of redundancy of our algorithm can be tuned based on the amount of communication and straggler resilience required. Finally, we also describe a variant of our algorithm that can flexibly recover the results based on the degree of straggling in the worker nodes. The variant allows for the performance to degrade gracefully as the number of successful (non-straggling) workers is lowered.
UR - http://www.scopus.com/inward/record.url?scp=85062823050&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062823050&partnerID=8YFLogxK
U2 - 10.1109/ALLERTON.2018.8635933
DO - 10.1109/ALLERTON.2018.8635933
M3 - Conference contribution
AN - SCOPUS:85062823050
T3 - 2018 56th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2018
SP - 196
EP - 203
BT - 2018 56th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 56th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2018
Y2 - 2 October 2018 through 5 October 2018
ER -