The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Updates
Mini-batch stochastic gradient descent (SGD) is state of the art in large-scale distributed training. The scheme can reach a linear speedup with respect to the number of workers, but this is rarely seen in practice, as the scheme often suffers from large ne ...
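For context, the error-feedback pattern studied in this line of work keeps a memory of what compression discarded and folds it back into the next update. Below is a minimal sketch assuming a top-k sparsifier and a toy quadratic objective; the names `top_k` and `ef_sgd` are illustrative, not taken from the paper.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest (a biased compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd(grad, x0, lr=0.1, k=2, steps=200):
    """Error-feedback SGD: compress (lr * gradient + memory), apply the compressed part,
    and carry the residual forward instead of discarding it."""
    x = x0.copy()
    memory = np.zeros_like(x0)          # accumulated compression error
    for _ in range(steps):
        p = lr * grad(x) + memory       # correct the update with past error
        delta = top_k(p, k)             # only this compressed part is transmitted
        memory = p - delta              # remember what was left out
        x -= delta
    return x

# Toy quadratic f(x) = 0.5 * ||x||^2, so grad f(x) = x.
x_final = ef_sgd(lambda x: x, np.array([1.0, -2.0, 3.0, -4.0]))
print(x_final)  # close to the origin despite the biased compressor
```

The key design choice is that the residual `memory` makes an otherwise biased compressor harmless on average, which is what enables rates matching uncompressed SGD.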
We propose a stochastic gradient framework for solving stochastic composite convex optimization problems with a (possibly) infinite number of linear inclusion constraints that must be satisfied almost surely. We use smoothing and homotopy techniques to ha ...
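As a point of reference, a template problem in this setting can be written as follows (the notation is assumed here, not taken from the snippet):

```latex
\min_{x \in \mathbb{R}^d} \; \mathbb{E}_{\xi}\bigl[ f(x,\xi) \bigr] + g(x)
\quad \text{subject to} \quad A_\xi\, x \in \mathcal{K}_\xi \ \text{almost surely},
```

where $f(\cdot,\xi)$ is smooth convex, $g$ is proper closed convex (possibly nonsmooth), and each pair $(A_\xi, \mathcal{K}_\xi)$ encodes one linear inclusion constraint; smoothing replaces the hard constraint with a penalty whose weight is driven along a homotopy path.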
Interest in distributed stochastic optimization has grown with the need to train complex machine learning models on more data using distributed systems. Increasing the computation power speeds up the training, but it faces a communication bottleneck between workers ...
Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same f ...
Pickands constants play a crucial role in the asymptotic theory of Gaussian processes. They are commonly defined as the limits of a sequence of expectations involving fractional Brownian motions and, as such, their exact value is often unknown. Recently, D ...
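For context, the classical definition referenced here expresses the Pickands constant $\mathcal{H}_\alpha$, $\alpha \in (0,2]$, through a fractional Brownian motion $B_\alpha$ with $\operatorname{Var} B_\alpha(t) = |t|^\alpha$:

```latex
\mathcal{H}_\alpha
= \lim_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}\!\left[ \exp\!\left( \sup_{t \in [0,T]}
  \bigl( \sqrt{2}\, B_\alpha(t) - t^{\alpha} \bigr) \right) \right],
```

with exact values known only in the boundary cases $\mathcal{H}_1 = 1$ and $\mathcal{H}_2 = 1/\sqrt{\pi}$.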
Data augmentation is the process of generating samples by transforming training data, with the aim of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations ...
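One common adaptive scheme, sketched below under stated assumptions, selects from a pool of transformations the one the current model finds hardest; the `model_loss` callback and the transformation pool are hypothetical placeholders, not the paper's API.

```python
import random

def hardest_transform(model_loss, sample, label, transforms, n_trials=4):
    """Pick the augmentation the current model finds hardest.

    model_loss(x, y) -> float is the classifier's loss on one example;
    each entry of `transforms` maps a sample to a transformed sample.
    """
    candidates = random.sample(transforms, min(n_trials, len(transforms)))
    return max(candidates, key=lambda t: model_loss(t(sample), label))

# Usage (hypothetical transforms): augment each example with its hardest transform.
# t = hardest_transform(model_loss, x, y, [flip, rotate15, crop, jitter])
# x_aug = t(x)
```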
The strong growth condition (SGC) is known to be a sufficient condition for linear convergence of the stochastic gradient method using a constant step-size γ (SGM-CS). In this paper, we provide a necessary condition for the linear convergence of SGM-CS, t ...
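For reference, with $f = \frac{1}{n}\sum_i f_i$, the strong growth condition with constant $c$ and the constant step-size stochastic gradient iteration read (standard notation, assumed here):

```latex
\mathbb{E}_i\bigl[ \|\nabla f_i(x)\|^2 \bigr] \le c\, \|\nabla f(x)\|^2
\qquad \text{and} \qquad
x_{k+1} = x_k - \gamma\, \nabla f_{i_k}(x_k).
```

The condition forces every component gradient to vanish wherever the full gradient does, which is what removes the noise floor that otherwise prevents linear convergence at a constant step size.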
We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for quadratic non-strongly-convex problems may be reformulated as constant-parameter second-order difference equation algorithms, where stability of the system is ...
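To make the reformulation concrete (notation assumed): on a quadratic $f(x) = \frac{1}{2} x^\top H x - b^\top x$, the heavy-ball iteration becomes

```latex
x_{k+1} = x_k - \gamma \nabla f(x_k) + \beta\,(x_k - x_{k-1})
        = \bigl( (1+\beta) I - \gamma H \bigr) x_k - \beta\, x_{k-1} + \gamma b,
```

a constant-coefficient second-order linear difference equation, so convergence is governed by the spectral radius of the associated companion matrix.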
Time series forecasting for streaming data plays an important role in many real applications, ranging from IoT systems and cyber-networks to industrial systems and healthcare. However, real data are often complicated by anomalies and change points, which ...
The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the stand ...
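The equivalence referenced here pairs, in the small step-size, slow-adaptation regime, the momentum recursion with a plain SGD recursion run at a rescaled step size; with momentum parameter $\beta \in [0,1)$ (standard notation, assumed here):

```latex
x_{k+1} = x_k - \gamma\, \nabla f_{i_k}(x_k) + \beta\,(x_k - x_{k-1})
\;\;\longleftrightarrow\;\;
y_{k+1} = y_k - \frac{\gamma}{1-\beta}\, \nabla f_{i_k}(y_k),
```

so the apparent acceleration from momentum amounts to running standard SGD with an effective step size $\gamma/(1-\beta)$.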