**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Concept# Conjugate gradient method

Summary

In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is positive-definite. The conjugate gradient method is often implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation or other direct methods such as the Cholesky decomposition. Large sparse systems often arise when numerically solving partial differential equations or optimization problems.
The conjugate gradient method can also be used to solve unconstrained optimization problems such as energy minimization. It is commonly attributed to Magnus Hestenes and Eduard Stiefel, who programmed it on the Z4, and extensively researched it.
The biconjugate gradient method provides a generalization to non-symmetric matrices. Various nonlinear conjugate gradient methods seek minima of nonlinear optimization problems.
Description of the problem addressed by conju

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related publications

Loading

Related people

Loading

Related units

Loading

Related concepts

Loading

Related courses

Loading

Related lectures

Loading

Related people (9)

Related publications (55)

Loading

Loading

Loading

Related concepts (18)

Iterative method

In computational mathematics, an iterative method is a mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which th

Mathematical optimization

Mathematical optimization (alternatively spelled optimisation) or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternative

Preconditioner

In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solving methods. Pr

Related units (4)

Related courses (37)

EE-556: Mathematics of data: from theory to computation

This course provides an overview of key advances in continuous optimization and statistical analysis for machine learning. We review recent learning formulations and models as well as their guarantees, describe scalable solution techniques and algorithms, and illustrate the trade-offs involved.

MATH-456: Numerical analysis and computational mathematics

The course provides an introduction to scientific computing. Several numerical methods are presented for the computer solution of mathematical problems arising in different applications. The software MATLAB is used to solve the problems and verify the theoretical properties of the numerical methods.

MATH-351: Advanced numerical analysis

The student will learn state-of-the-art algorithms for solving differential equations. The analysis and implementation of these algorithms will be discussed in some detail.

The present report addresses the rigorous study of convergence of a stabilized finite volume element method applied to stationary and time dependent Stokes problems. Some preliminary results about the existence and uniqueness of solution at continuous and discrete level are provided for stabilized finite element and finite volume element formulations when using two different stabilization techniques, and then we prove consistency of the methods for the time-independent and time-dependent problems, and derive optimal convergence rates.

2009Introduction of optimisation problems in which the objective function is black box or obtaining the gradient is infeasible, has recently raised interest in zeroth-order optimisation methods. As an example finding adversarial examples for Deep Learning models (Chen et al. (2017); Moosavi-Dezfooli et al. (2016)) is one of the most common applications in which zeroth-order methods could be used. These optimisation methods use only function values at certain points to estimate the gradient. Most current approaches iteratively sample a random search direction along which they compute an estimation of the gradient (Nesterov and Spokoiny (2017); Conn et al. (2009); Wibisono et al. (2012)). However, due to the high variance in the search direction, these methods usually need d times more iterations than the standard gradient methods, where d is the dimensionality of the problem. So it seems that the main effort for improving the zeroth-order methods should be in reducing the variance of the gradient estimate. In this work we will analyse the gradient-free oracle which uses random directions sampled form a Gaussian distribution. Our analysis shows that in smooth and strongly convex setting, we have a convergence rate of O( d/T) which clearly shows the dependency to the dimension of the problem. Furthermore we propose some variance reduction methods to make the zeroth-order optimisation faster. We experiment our proposed methods in Python to compare their convergence in stochastic and non-stochastic setting. Our empirical results show that in a setting that number of allowed function evaluation is fixed, using a variance reduction method (e.g. momentum) can make the convergence of zeroth-order methods happen faster.

2019Nicolas Boumal, Christopher Arnold Criscitiello

We describe the first gradient methods on Riemannian manifolds to achieve accelerated rates in the non-convex case. Under Lipschitz assumptions on the Riemannian gradient and Hessian of the cost function, these methods find approximate first-order critical points faster than regular gradient descent. A randomized version also finds approximate second-order critical points. Both the algorithms and their analyses build extensively on existing work in the Euclidean case. The basic operation consists in running the Euclidean accelerated gradient descent method (appropriately safe-guarded against non-convexity) in the current tangent space, then moving back to the manifold and repeating. This requires lifting the cost function from the manifold to the tangent space, which can be done for example through the Riemannian exponential map. For this approach to succeed, the lifted cost function (called the pullback) must retain certain Lipschitz properties. As a contribution of independent interest, we prove precise claims to that effect, with explicit constants. Those claims are affected by the Riemannian curvature of the manifold, which in turn affects the worst-case complexity bounds for our optimization algorithms.

Related lectures (122)