Extrapolation for Large-batch Training in Deep Learning
Data augmentation is the process of generating samples by transforming training data, with the goal of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations ...
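As a rough illustration of the setting this abstract describes, here is a minimal Python sketch of applying random transformations to a batch of images; the specific transformations and probabilities are illustrative assumptions, not the adaptive choices proposed in the paper.

```python
import numpy as np

def augment(batch, rng=np.random.default_rng()):
    """Apply simple random transformations to a batch of images (N, H, W, C)."""
    out = []
    for img in batch:
        if rng.random() < 0.5:           # random horizontal flip
            img = img[:, ::-1, :]
        shift = int(rng.integers(-2, 3)) # small random horizontal shift
        img = np.roll(img, shift, axis=1)
        out.append(img)
    return np.stack(out)
```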
The strong growth condition (SGC) is known to be a sufficient condition for linear convergence of the stochastic gradient method using a constant step-size γ (SGM-CS). In this paper, we provide a necessary condition, for the linear convergence of SGM-CS, t ...
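For reference, a minimal sketch of the stochastic gradient method with constant step-size γ (SGM-CS) on a finite-sum objective; `grad_i` is an assumed user-supplied function returning the gradient of the i-th component, and the names here are illustrative.

```python
import numpy as np

def sgm_cs(grad_i, w0, n, gamma, iters, rng=np.random.default_rng()):
    """SGM with constant step-size gamma on a sum of n component functions."""
    w = w0.copy()
    for _ in range(iters):
        i = int(rng.integers(n))       # sample one component uniformly
        w = w - gamma * grad_i(i, w)   # constant step-size update
    return w
```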
The minimization of empirical risks over finite sample sizes is an important problem in large-scale machine learning. A variety of algorithms have been proposed in the literature to alleviate the computational burden per iteration at the expense of converge ...
Interest in deep probabilistic graphical models has increased in recent years, due to their state-of-the-art performance on many machine learning applications. Such models are typically trained with the stochastic gradient method, which can take a signif ...
In this paper, we revisit an efficient algorithm for noisy group testing in which each item is decoded separately (Malyutov and Mateev, 1980), and develop novel performance guarantees via an information-theoretic framework for general noise models. For the ...
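A minimal sketch of the separate-decoding idea the abstract refers to: each item is classified on its own by thresholding a per-item statistic, here the fraction of positive tests among those the item participates in. The threshold is an illustrative parameter, not the information-theoretic rule analyzed in the paper.

```python
import numpy as np

def separate_decode(A, y, thresh=0.9):
    """A: (tests x items) 0/1 participation matrix; y: 0/1 test outcomes."""
    pos_rate = (A * y[:, None]).sum(axis=0) / np.maximum(A.sum(axis=0), 1)
    return pos_rate >= thresh  # boolean mask of items declared defective
```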
Deep learning presents notorious computational challenges. These challenges include, but are not limited to, the non-convexity of learning objectives and estimating the quantities needed for optimization algorithms, such as gradients. While we do not a ...
Restricted Boltzmann Machines (RBMs) are widely used as building blocks for deep learning models. Learning typically proceeds by using stochastic gradient descent, and the gradients are estimated with sampling methods. However, the gradient estimation is a ...
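To make the sampling-based gradient estimation concrete, here is a minimal sketch of one step of contrastive divergence (CD-1) for a binary RBM, a standard estimator in this setting; the variable names and the single Gibbs step are simplifying assumptions, not the paper's specific procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_gradients(W, b, c, v, rng=np.random.default_rng()):
    """One CD-1 gradient estimate for a binary RBM with weights W, biases b, c."""
    ph = sigmoid(v @ W + c)                       # P(h=1 | v), positive phase
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sigmoid(h @ W.T + b)                     # reconstruct visibles
    vneg = (rng.random(pv.shape) < pv).astype(float)
    phneg = sigmoid(vneg @ W + c)                 # negative phase
    n = v.shape[0]
    dW = (v.T @ ph - vneg.T @ phneg) / n          # positive minus negative phase
    db = (v - vneg).mean(axis=0)
    dc = (ph - phneg).mean(axis=0)
    return dW, db, dc
```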
Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same f ...
Data-based control design methods most often consist of iterative adjustment of the controller's parameters towards the parameter values which minimize an H2 performance criterion. Typically, batches of input-output data collected from the system are ...
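A minimal sketch of this iterative tuning loop: the H2-type cost is estimated from a batch of input-output data and the controller parameters are adjusted by a finite-difference gradient step. `collect_batch` and `estimate_cost` are hypothetical stand-ins for the closed-loop experiment and the performance criterion, not functions from the paper.

```python
import numpy as np

def tune_controller(theta0, collect_batch, estimate_cost,
                    step=0.01, eps=1e-3, iters=20):
    """Adjust controller parameters theta toward lower estimated H2 cost."""
    theta = theta0.copy()
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for k in range(theta.size):      # finite-difference gradient estimate
            d = np.zeros_like(theta)
            d[k] = eps
            grad[k] = (estimate_cost(collect_batch(theta + d))
                       - estimate_cost(collect_batch(theta - d))) / (2 * eps)
        theta = theta - step * grad      # move toward lower cost
    return theta
```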
The convex ℓ1-regularized log-det divergence criterion has been shown to produce theoretically consistent graph learning. However, this objective function is challenging since the ℓ1-regularization is nonsmooth, the log-det objective is not globally Li ...
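For orientation, the objective is min over Θ ≻ 0 of tr(SΘ) − log det Θ + λ‖Θ‖₁, where S is the sample covariance. Below is a minimal proximal-gradient sketch of this problem; the fixed step size is assumed small enough to keep the iterate positive definite, and this is an illustration of the objective, not the algorithm developed in the paper.

```python
import numpy as np

def soft_threshold(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def graph_learning(S, lam, t=0.1, iters=200):
    """Proximal gradient on tr(S@Theta) - log det Theta + lam * ||Theta||_1."""
    p = S.shape[0]
    Theta = np.eye(p)
    for _ in range(iters):
        grad = S - np.linalg.inv(Theta)              # gradient of the smooth part
        Theta = soft_threshold(Theta - t * grad, t * lam)
        Theta = (Theta + Theta.T) / 2                # keep the iterate symmetric
    return Theta
```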