Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks
This paper examines the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic g ...
Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same f ...
The backpropagation algorithm is widely used for training multilayer neural networks. In this publication, the gain of its activation function(s) is investigated. Specifically, it is proven that changing the gain of the activation function is equivalent to c ...
The backpropagation algorithm is widely used for training multilayer neural networks. In this publication, the steepness of its activation functions is investigated. Specifically, it is discussed that changing the steepness of the activation function is equi ...
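A minimal check of the equivalence stated in the two abstracts above, under the simplifying assumption of a single tanh neuron with squared loss (the data, weights, and constants below are illustrative, not taken from those publications): a unit computing f(g * w.x) trained with learning rate eta follows, after rescaling, the same trajectory as a gain-1 unit with weights g*w and learning rate g**2 * eta.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(4)          # one input vector (illustrative)
    y = 0.3                             # target value (illustrative)
    w = rng.standard_normal(4)          # weights of the gain-g neuron
    g, eta = 2.5, 0.01                  # assumed gain and learning rate

    def dtanh(a):
        return 1.0 - np.tanh(a) ** 2

    # Neuron with gain g: output tanh(g * w.x), squared loss, one gradient step on w.
    a = w @ x
    err = np.tanh(g * a) - y
    w_new = w - eta * err * dtanh(g * a) * g * x

    # Equivalent gain-1 neuron: weights g*w, learning rate g**2 * eta.
    w_tilde = g * w
    err_t = np.tanh(w_tilde @ x) - y
    w_tilde_new = w_tilde - (g ** 2) * eta * err_t * dtanh(w_tilde @ x) * x

    # The rescaled updates coincide: g * w_new equals w_tilde_new up to float error.
    print(np.max(np.abs(g * w_new - w_tilde_new)))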
The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the stand ...
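As a rough numerical illustration of the equivalence described in this abstract and the similar one above (a sketch only; the quadratic objective, step size, and momentum value are assumptions, not taken from the article): heavy-ball momentum with step size mu and momentum beta behaves, in the constant step-size and slow adaptation regime, much like plain SGD with the rescaled step size mu / (1 - beta).

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative least-squares objective 0.5 * ||A x - b||^2 (not from the papers above).
    A = rng.standard_normal((50, 10))
    b = rng.standard_normal(50)
    x_star = np.linalg.lstsq(A, b, rcond=None)[0]

    def grad(x, batch):
        Ab, bb = A[batch], b[batch]        # stochastic gradient on a sampled row subset
        return Ab.T @ (Ab @ x - bb)

    mu, beta, steps = 1e-3, 0.9, 5000
    x_mom, v = np.zeros(10), np.zeros(10)  # heavy-ball iterate and momentum buffer
    x_sgd = np.zeros(10)                   # plain SGD with rescaled step size mu / (1 - beta)

    for _ in range(steps):
        batch = rng.integers(0, 50, size=5)
        v = beta * v + grad(x_mom, batch)
        x_mom -= mu * v
        x_sgd -= (mu / (1 - beta)) * grad(x_sgd, batch)

    # Both iterates settle in a comparable neighbourhood of the least-squares solution.
    print(np.linalg.norm(x_mom - x_star), np.linalg.norm(x_sgd - x_star))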
Data augmentation is the process of generating samples by transforming training data, with the aim of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations ...
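For orientation only, a minimal sketch of the generic process the first sentence describes, i.e. generating new samples by transforming training data; the flip, shift, and noise transforms and their parameters are arbitrary illustrations, not the adaptive selection algorithm proposed in the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def augment(image, rng):
        # Return one transformed copy of a training image (illustrative transforms only).
        out = image
        if rng.random() < 0.5:
            out = np.fliplr(out)                       # horizontal flip
        out = np.roll(out, rng.integers(-2, 3), 0)     # small vertical shift
        out = out + rng.normal(0.0, 0.01, out.shape)   # light pixel noise
        return out

    # Usage: expand a toy dataset with two augmented copies per image.
    images = rng.random((8, 28, 28))
    augmented = np.stack([augment(im, rng) for im in images for _ in range(2)])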
In this paper, we propose a scalable algorithm for spectral embedding, a standard tool for graph clustering. However, its computational bottleneck is the eigendecomposition of the graph Laplacian matrix, which prevents its application to larg ...
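A minimal sketch of the standard spectral-embedding pipeline the abstract refers to; the dense eigendecomposition below is precisely the bottleneck mentioned, and this is not the scalable algorithm the paper proposes.

    import numpy as np
    from scipy.sparse import csgraph

    def spectral_embedding(adjacency, k):
        # Embed graph nodes via the k smallest eigenvectors of the normalized Laplacian.
        lap = csgraph.laplacian(adjacency, normed=True)
        # Full eigendecomposition: this step is the bottleneck on large graphs.
        vals, vecs = np.linalg.eigh(lap)
        return vecs[:, :k]   # rows give k-dimensional node coordinates for clustering

    # Usage on a small illustrative adjacency matrix.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    print(spectral_embedding(A, k=2))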