Publication

Optimal Convergence for Distributed Learning with Stochastic Gradient Methods and Spectral Algorithms

Related publications (66)

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Learning Positive Functions with Pseudo Mirror Descent

Negar Kiyavash

The nonparametric learning of positive-valued functions appears widely in machine learning, especially in the context of estimating intensity functions of point processes. Yet, existing approaches either require computing expensive projections or semidefin ...

Curran Associates, Inc.2019

Local SGD Converges Fast and Communicates Little

Sebastian Urban Stich

Mini-batch stochastic gradient descent (SGD) is state of the art in large scale distributed training. The scheme can reach a linear speedup with respect to the number of workers, but this is rarely seen in practice as the scheme often suffers from large ne ...

2019

Kernel Regression for Graph Signal Prediction in Presence of Sparse Noise

Pascal Frossard, Arun Venkitaraman

In presence of sparse noise we propose kernel regression for predicting output vectors which are smooth over a given graph. Sparse noise models the training outputs being corrupted either with missing samples or large perturbations. The presence of sparse ...

IEEE2019

Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods

Volkan Cevher, Junhong Lin

We study generalization properties of distributed algorithms in the setting of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We investigate distributed stochastic gradient methods (SGM), with mini-batches and multi-passes over th ...

2018

Neural Tanget Kernel: Convergence and Generalization in Neural Networks

At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit [12, 9], thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: ...

NEURAL INFORMATION PROCESSING SYSTEMS (NIPS)2018

Data-driven reduced order modeling for time-dependent problems

Jan Sickmann Hesthaven, Mengwu Guo

A data-driven reduced basis (RB) method for parametrized time-dependent problems is proposed. This method requires the offline preparation of a database comprising the time history of the full-order solutions at parameter locations. Based on the full-order ...

2018

Methodology And Convergence Rates For Functional Time Series Regression

Victor Panaretos, Tung Huy Pham

The functional linear model extends the notion of linear regression to the case where the response and covariates are iid elements of an infinite-dimensional Hilbert space. The unknown to be estimated is a Hilbert-Schmidt operator, whose inverse is by defi ...

STATISTICA SINICA2018

Coordinate-Descent Diffusion Learning by Networked Agents

Ali H. Sayed, Bicheng Ying, Chengcheng Wang

This paper examines the mean-square error performance of diffusion stochastic algorithms under a generalized coordinate-descent scheme. In this setting, the adaptation step by each agent is limited to a random subset of the coordinates of its stochastic gr ...

Ieee-Inst Electrical Electronics Engineers Inc2018

On the linear convergence of the stochastic gradient method with constant step-size

Volkan Cevher, Cong Bang Vu

The strong growth condition (SGC) is known to be a sufficient condition for linear convergence of the stochastic gradient method using a constant step-size γ (SGM-CS). In this paper, we provide a necessary condition, for the linear convergence of SGM-CS, t ...

2018

Multi-task additive models with shared transfer functions based on dictionary learning

Pascal Frossard, Alhussein Fawzi

Additive models form a widely popular class of regression models which represent the relation between covariates and response variables as the sum of low-dimensional transfer functions. Besides flexibility and accuracy, a key benefit of these models is the ...

Institute of Electrical and Electronics Engineers2017