Communication trade-offs for Local-SGD with large step size

Synchronous mini-batch SGD is state-of-the-art for large-scale distributed machine learning. In practice, however, its convergence is bottlenecked by slow communication rounds between worker nodes. A natural way to reduce communication is the "local-SGD" model, in which the workers train their models independently and synchronize only every once in a while. This algorithm improves the computation-communication trade-off, but its convergence is not well understood. We propose a non-asymptotic error analysis, which enables comparison to one-shot averaging, i.e., a single communication round among independent workers, and mini-batch averaging, i.e., communicating at every step. We also provide adaptive lower bounds on the communication frequency for large step-sizes ($t^{-\alpha}$, $\alpha \in (1/2, 1)$) and show that local-SGD reduces communication by a factor of $O\left(\sqrt{T}/P^{3/2}\right)$, where $T$ is the total number of gradients and $P$ the number of machines.
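The three communication regimes named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experimental setup or analysis: the one-dimensional quadratic objective, the function name `local_sgd`, and the parameters `sync_every`, `target`, and `seed` are all assumptions made for illustration. Only the step size $t^{-\alpha}$ with $\alpha \in (1/2, 1)$ and the idea of periodic averaging come from the text.

```python
import random


def local_sgd(num_workers, steps_per_worker, sync_every,
              alpha=0.75, target=3.0, seed=0):
    """Toy local-SGD on f(w) = E[(w - x)^2] / 2 with x ~ N(target, 1).

    Hypothetical illustration: workers are simulated sequentially.
    Returns (final averaged iterate, number of communication rounds).
    """
    rng = random.Random(seed)
    workers = [0.0] * num_workers  # all workers start from the same point
    rounds = 0
    for t in range(1, steps_per_worker + 1):
        step = t ** (-alpha)  # step size t^(-alpha), alpha in (1/2, 1)
        for p in range(num_workers):
            sample = rng.gauss(target, 1.0)
            grad = workers[p] - sample  # stochastic gradient of the toy loss
            workers[p] -= step * grad   # purely local update, no communication
        # Synchronize every `sync_every` steps, and once more at the very end.
        if t % sync_every == 0 or t == steps_per_worker:
            mean = sum(workers) / num_workers
            workers = [mean] * num_workers
            rounds += 1
    return workers[0], rounds


if __name__ == "__main__":
    steps = 2000
    for name, period in [("mini-batch averaging", 1),
                         ("local-SGD", 50),
                         ("one-shot averaging", steps)]:
        w, rounds = local_sgd(num_workers=4, steps_per_worker=steps,
                              sync_every=period)
        print(f"{name}: w = {w:.4f}, communication rounds = {rounds}")
```

Setting `sync_every=1` averages after every step, which corresponds to mini-batch averaging, while `sync_every=steps_per_worker` leaves a single averaging round at the end, which corresponds to one-shot averaging. Values in between give local-SGD with proportionally fewer communication rounds.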

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Few-shot Learning for Efficient and Effective Machine Learning Model Adaptation

Random matrix methods for high-dimensional machine learning models
