Stochastic Gradient Descent for Spectral Embedding with Implicit Orthogonality Constraint
Related publications (42)
Non-convex constrained optimization problems have become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large-scale semidefinite programs (SDPs), and various other tasks. As the perfor ...
Within the context of contemporary machine learning problems, the efficiency of the optimization process depends on the properties of the model and the nature of the available data, which poses a significant problem as the complexity of either increases ad infinit ...
We propose a stochastic conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. Existing CGM variants for this template either suffer from slow convergence rates, or require carefully inc ...
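For context, the conditional gradient (Frank-Wolfe) template that such variants build on replaces projections with a linear minimization oracle. Below is a minimal Python sketch on the probability simplex, using a stochastic gradient of an illustrative finite-sum least-squares objective; the problem data and the classical 2/(k+2) step-size rule are assumptions for illustration, not the paper's specific method.

import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A, b = rng.standard_normal((n, d)), rng.standard_normal(n)  # illustrative data

x = np.full(d, 1.0 / d)                   # start at the simplex barycentre
for k in range(500):
    i = rng.integers(n)                   # sample one term of the finite sum
    g = A[i] * (A[i] @ x - b[i])          # stochastic gradient of 0.5*(A[i] @ x - b[i])**2
    s = np.zeros(d)
    s[np.argmin(g)] = 1.0                 # linear minimization oracle over the simplex
    gamma = 2.0 / (k + 2)                 # classical CGM step size
    x += gamma * (s - x)                  # convex-combination update keeps x feasible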
The variational approach is a cornerstone of computational physics, on both conventional and quantum computing platforms. The variational quantum eigensolver algorithm aims to prepare the ground state of a Hamiltonian exploiting para ...
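As background, the variational principle underlying such eigensolvers is standard: for a parametrized trial state |ψ(θ)⟩,

    E(θ) = ⟨ψ(θ)|H|ψ(θ)⟩ / ⟨ψ(θ)|ψ(θ)⟩ ≥ E_0,

so minimizing the energy E(θ) over the circuit parameters θ drives the prepared state toward the ground state, whose energy is E_0.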
Stochastic gradient descent (SGD) and randomized coordinate descent (RCD) are two of the workhorses for training modern automated decision systems. Intriguingly, convergence properties of these methods are not well-established as we move away from the spec ...
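For reference, the two updates this abstract contrasts are, for a finite-sum objective f(x) = (1/n) Σ_i f_i(x) on R^d (notation chosen here for illustration):

    SGD:  x_{k+1} = x_k − γ ∇f_{i_k}(x_k),                i_k uniform over the n terms,
    RCD:  x_{k+1} = x_k − γ (∂f/∂x_{j_k})(x_k) e_{j_k},   j_k uniform over the d coordinates,

i.e. SGD randomizes over the data while RCD randomizes over the coordinates of the decision variable.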
This thesis focuses on designing spectral tools for graph clustering in sublinear time. With the emergence of big data, many traditional polynomial-time, and even linear-time, algorithms have become prohibitively expensive. Processing modern datasets requir ...
We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent. Our metric, called gradient disparity, is the l2 norm distance between the gradient vectors of two mini-batches drawn from the t ...
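A minimal sketch of that metric, assuming a linear least-squares model purely for illustration (the paper applies it to deep networks trained with mini-batch gradient descent):

import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 20
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)  # illustrative data
w = rng.standard_normal(d)                                  # current parameters

def minibatch_grad(idx):
    # gradient of 0.5 * mean((X @ w - y)**2) restricted to the mini-batch idx
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ w - yb) / len(idx)

b1 = rng.choice(n, size=64, replace=False)   # two independent mini-batches
b2 = rng.choice(n, size=64, replace=False)
gradient_disparity = np.linalg.norm(minibatch_grad(b1) - minibatch_grad(b2))
print(gradient_disparity)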
In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size γ and momentum parameter β that allows u ...
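The recursion in question is the standard heavy-ball update (the paper's exact parameterization may differ):

    x_{k+1} = x_k − γ ∇f(x_k) + β (x_k − x_{k−1}),

and a continuous-time analysis studies its limiting second-order ODE, ẍ(t) + a ẋ(t) + ∇f(x(t)) = 0, where the damping coefficient a depends on how γ and β are scaled jointly.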
We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying the gradient dominance property with 1 ≤ α ≤ 2, which holds in a wide range of applications in machine learning and signal processing. This conditio ...
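Concretely, gradient dominance of order α (with the constant τ named here for illustration) requires

    f(x) − min_z f(z) ≤ τ ‖∇f(x)‖^α   for all x,   with 1 ≤ α ≤ 2,

where α = 2 recovers the Polyak–Łojasiewicz condition.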
In this work we investigate stochastic non-convex optimization problems where the objective is an expectation over smooth loss functions, and the goal is to find an approximate stationary point. The most popular approach to handling such problems is varianc ...
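The canonical variance-reduced estimator this line of work builds on (SVRG-style, stated here as background rather than as the paper's method) replaces the plain stochastic gradient with

    v_k = ∇f_{i_k}(x_k) − ∇f_{i_k}(x̃) + ∇f(x̃),

where x̃ is an anchor point at which the full gradient is computed occasionally; v_k is unbiased for ∇f(x_k), and its variance shrinks as x_k and x̃ approach each other.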