**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Publication# Accelerated SGD for Non-Strongly-Convex Least Squares

Abstract

We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting. We present the first practical algorithm that achieves the optimal prediction error rates in terms of dependence on the noise of the problem, as $O(d/t)$ while accelerating the forgetting of the initial conditions to $O(d/t^2)$. Our new algorithm is based on a simple modification of the accelerated gradient descent. We provide convergence results for both the averaged and the last iterate of the algorithm. In order to describe the tightness of these new bounds, we present a matching lower bound in the noiseless setting and thus show the optimality of our algorithm.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related publications (3)

Related concepts (8)

Gradient descent

In mathematics, gradient descent (also often called steepest descent) is a iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient (or approximate gradient) of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent.

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing calculations and data processing. More advanced algorithms can use conditionals to divert the code execution through various routes (referred to as automated decision-making) and deduce valid inferences (referred to as automated reasoning), achieving automation eventually.

Least squares

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the residuals (a residual being the difference between an observed value and the fitted value provided by a model) made in the results of each individual equation. The most important application is in data fitting.

This paper considers the multi-agent linear least-squares problem in a server-agent network architecture. The system comprises multiple agents, each with a set of local data points. The agents are con

Nicolas Henri Bernard Flammarion, Aditya Vardhan Varre, Loucas Pillaud-Vivien

Motivated by the recent successes of neural networks that have the ability to fit the data perfectly \emph{and} generalize well, we study the noiseless model in the fundamental least-squares setup. We

2021Finding convergence rates for numerical optimization algorithms is an important task, because it gives a justification to their use in solving practical problems, while also providing a way to compare

2017