(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability
Related publications (33)
In this paper we fully describe the trajectory of gradient flow over diagonal linear networks in the limit of vanishing initialisation. We show that the limiting flow successively jumps from one saddle of the training loss to another until reaching the minim ...
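As a rough illustration of the setting, here is a minimal sketch of gradient descent over a diagonal linear network, assuming the common beta = u * v parameterisation; the data, step size, and initialisation scale are illustrative choices, not the paper's setup.

    import numpy as np

    # Gradient descent over a diagonal linear network: the regressor is
    # parameterised as beta = u * v (elementwise), and plain GD is run on
    # the squared loss. A small initialisation scale alpha mimics the
    # vanishing-initialisation regime, which biases GD towards sparse beta.
    rng = np.random.default_rng(0)
    n, d = 20, 50
    X = rng.standard_normal((n, d))
    beta_star = np.zeros(d)
    beta_star[:3] = 1.0                      # sparse ground truth
    y = X @ beta_star

    alpha, lr = 1e-3, 1e-2                   # illustrative scales
    u = alpha * np.ones(d)
    v = alpha * np.ones(d)
    for _ in range(20000):
        g = X.T @ (X @ (u * v) - y) / n      # gradient w.r.t. beta
        u, v = u - lr * g * v, v - lr * g * u  # chain rule through beta = u*v

    print(np.round((u * v)[:5], 2))          # first coordinates of recovered beta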
We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We consider the asymptotic limit where the number of samples, the input dimension and the network width ...
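The data-generating model this abstract refers to (a deep teacher network with random Gaussian weights) can be sketched as follows; the depth, widths, and tanh activation are illustrative assumptions, not the paper's exact setup.

    import numpy as np

    # Teacher-student data model: the target function is the output of a
    # deep non-linear network with random Gaussian weights. Depth, widths,
    # and the tanh activation are illustrative choices.
    rng = np.random.default_rng(1)
    d, width, depth, n = 100, 200, 3, 500    # "extensive width": width = O(d)
    weights = [rng.standard_normal((d if i == 0 else width, width))
               / np.sqrt(d if i == 0 else width) for i in range(depth)]
    readout = rng.standard_normal(width) / np.sqrt(width)

    def teacher(x):
        h = x
        for W in weights:
            h = np.tanh(h @ W)
        return h @ readout

    X = rng.standard_normal((n, d))
    y = teacher(X)                           # works row-wise since h @ W broadcasts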
A computer-implemented method for reconstructing/recovering high-resolution visible light spectral data at a target resolution d, that comprises obtaining a configuration of a low-resolution multi-channel imaging sensor of resolution p, the configuration ... (2022)
One of the objectives of pharmacometrics (PMX) population modeling is the identification of significant and clinically relevant relationships between parameters and covariates. Here, we demonstrate how this complex selection task could benefit from supervise ... (2021)
We study the problem of one-dimensional regression of data points with total-variation (TV) regularization (in the sense of measures) on the second derivative, which is known to promote piecewise-linear solutions with few knots. While there are efficient a ... (Elsevier, 2021)
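A discrete analogue of the problem above, often called l1 trend filtering, can be written down in a few lines; the grid, noise level, regularisation weight, and solver below are illustrative, and this is not the paper's continuous-domain algorithm.

    import cvxpy as cp
    import numpy as np

    # Discrete analogue of second-derivative TV regularisation (l1 trend
    # filtering): penalising the l1 norm of second differences promotes
    # piecewise-linear fits with few knots.
    rng = np.random.default_rng(2)
    n = 100
    t = np.linspace(0.0, 1.0, n)
    y = np.maximum(0.0, 1.0 - 4.0 * np.abs(t - 0.5)) + 0.05 * rng.standard_normal(n)

    f = cp.Variable(n)
    second_diff = f[2:] - 2 * f[1:-1] + f[:-2]
    lam = 1.0                                # regularisation weight (illustrative)
    cp.Problem(cp.Minimize(0.5 * cp.sum_squares(f - y)
                           + lam * cp.norm1(second_diff))).solve()
    # f.value is now a piecewise-linear fit with a small number of kinks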
Coordinate descent with random coordinate selection is the current state of the art for many large-scale optimization problems. However, greedy selection of the steepest coordinate on smooth problems can yield convergence rates independent of the dimension ... (Microtome Publishing, 2019)
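For concreteness, here is a minimal sketch of greedy (Gauss-Southwell) coordinate selection on a smooth quadratic, where the steepest coordinate is minimised exactly at each step; the problem instance is illustrative.

    import numpy as np

    # Greedy (Gauss-Southwell) coordinate descent on the smooth quadratic
    # f(x) = 0.5 * x^T A x - b^T x: at each step the coordinate with the
    # largest-magnitude partial derivative is minimised exactly.
    rng = np.random.default_rng(3)
    d = 50
    M = rng.standard_normal((d, d))
    A = M @ M.T + np.eye(d)                  # symmetric positive definite
    b = rng.standard_normal(d)

    x = np.zeros(d)
    grad = A @ x - b                         # gradient of f
    for _ in range(1000):
        i = np.argmax(np.abs(grad))          # steepest coordinate
        step = grad[i] / A[i, i]             # exact minimisation along e_i
        x[i] -= step
        grad -= step * A[:, i]               # cheap rank-one gradient update

    print(np.linalg.norm(A @ x - b))         # ~0 at convergence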
We consider the problem of estimating the slope function in a functional regression with a scalar response and a functional covariate. This central problem of functional data analysis is well known to be ill-posed, thus requiring a regularised estimation p ...
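For reference, the scalar-on-function model behind this abstract is commonly written as

    \[ Y_i = \alpha + \int_0^1 \beta(t)\, X_i(t)\, dt + \varepsilon_i, \qquad i = 1, \dots, n, \]

where the slope function \beta is the estimand; this is the standard formulation of the functional linear model, not necessarily the paper's exact notation. Estimating \beta amounts to inverting the covariance operator of X, whose eigenvalues accumulate at zero, which is the source of the ill-posedness and the reason a regularised estimator is needed.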
We study 1D continuous-domain inverse problems for multicomponent signals. The prior assumption on these signals is that each component is sparse in a different dictionary, specified by a regularization operator. We introduce a hybrid regularization functi ...
We study one-dimensional continuous-domain inverse problems with multiple generalized total-variation regularization, which involves the joint use of several regularization operators. Our starting point is a new representer theorem that states that such in ...
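The two abstracts above share a common template: a composite objective with one regularization operator per signal component. A generic form (the notation here is assumed, not taken from the papers) is

    \[ \min_{f_1, \dots, f_K} \; E\big(y,\ \mathcal{A}(f_1 + \dots + f_K)\big) + \sum_{k=1}^{K} \lambda_k \, \| \mathrm{L}_k f_k \|_{\mathcal{M}}, \]

where \mathcal{A} is the measurement operator, \mathrm{L}_k the regularization operator for component k, and \|\cdot\|_{\mathcal{M}} the total-variation norm in the sense of measures; a representer theorem of the kind mentioned above characterises the solutions of problems of this shape.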
A functional (lagged) time series regression model involves the regression of scalar response time series on a time series of regressors that consists of a sequence of random functions. In practice, the underlying regressor curve time series are not always ...