Keyphrases
Adaptive Gradient Methods: 100%
Communication Efficiency: 66%
Communication Cost: 66%
Distributed Learning: 66%
AMSGrad: 66%
First-order: 33%
Training Data: 33%
Stationary Point: 33%
Local Labour: 33%
Learning Framework: 33%
Vanilla: 33%
Central Server: 33%
Machine Learning Models: 33%
Model Update: 33%
Stochastic Gradient Descent: 33%
Iteration Complexity: 33%
Nonconvex Optimization Problems: 33%
Compression Strategy: 33%
Nonconvex Stochastic Optimization: 33%
Error Feedback: 33%
Optimization Setting: 33%
Large-scale Machine Learning: 33%

Computer Science
Gradient Method: 100%
Communication Cost: 66%
Distributed Learning: 66%
Optimization Problem: 33%
Training Dataset: 33%
Gradient Descent: 33%
Learning Framework: 33%
Stationary Point: 33%