Masked Training of Neural Networks with Partial Gradients
Related publications (66)
Countless signal processing applications involve the reconstruction of signals from a few indirect linear measurements. The design of effective measurement operators is typically constrained by the underlying hardware and physics, posing a challenging and of ...
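For intuition about this measurement setting, the sketch below recovers a sparse signal x from underdetermined measurements y = A x via iterative soft-thresholding (ISTA), a standard baseline; the random operator A, sparsity level, and regularization weight are illustrative assumptions, not the paper's design.

```python
import numpy as np

def ista(A, y, lam=0.1, iters=200):
    """Iterative soft-thresholding (ISTA): recover a sparse x from y = A x."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - step * A.T @ (A @ x - y)                        # gradient step on ||Ax - y||^2 / 2
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)  # soft-thresholding (l1 prox)
    return x

rng = np.random.default_rng(0)
m, n = 20, 100                      # far fewer measurements than unknowns
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, size=5, replace=False)] = rng.standard_normal(5)
y = A @ x_true                      # few indirect linear measurements
x_hat = ista(A, y)                  # sparse estimate of x_true
```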
Learning-based image coding has shown promising results during recent years. Unlike the traditional approaches to image compression, learning-based codecs exploit deep neural networks for reducing dimensionality of the input at the stage where a linear tra ...
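As a minimal sketch of that idea, the autoencoder below replaces the codec's linear transform with nonlinear analysis/synthesis networks around a low-dimensional bottleneck; the layer sizes, input shape, and MSE distortion objective are assumptions for illustration, not the cited codec's architecture.

```python
import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    """Nonlinear analysis/synthesis transforms standing in for the
    linear transform of a traditional codec; sizes are illustrative."""
    def __init__(self, dim=784, bottleneck=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                     nn.Linear(128, bottleneck))
        self.decoder = nn.Sequential(nn.Linear(bottleneck, 128), nn.ReLU(),
                                     nn.Linear(128, dim))

    def forward(self, x):
        code = self.encoder(x)      # low-dimensional "latent" representation
        return self.decoder(code)   # reconstruction from the compressed code

x = torch.rand(8, 784)                   # a batch of flattened images (placeholder data)
recon = TinyAutoencoder()(x)
loss = nn.functional.mse_loss(recon, x)  # distortion term of a rate-distortion objective
```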
We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent. Our metric, called gradient disparity, is the l2 norm distance between the gradient vectors of two mini-batches drawn from the t ...
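The metric itself is simple to express. A minimal PyTorch sketch (the model, data, and loss below are stand-ins) computes the l2 distance between the gradient vectors of two mini-batches:

```python
import torch
import torch.nn as nn

def flat_grad(model, loss):
    """Concatenate the gradients of `loss` w.r.t. all model parameters."""
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.reshape(-1) for g in grads])

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
criterion = nn.MSELoss()

# Two mini-batches drawn from the training set (random stand-ins here).
x1, y1 = torch.randn(16, 10), torch.randn(16, 1)
x2, y2 = torch.randn(16, 10), torch.randn(16, 1)

g1 = flat_grad(model, criterion(model(x1), y1))
g2 = flat_grad(model, criterion(model(x2), y2))

gradient_disparity = torch.norm(g1 - g2, p=2)  # l2 distance between the two gradients
print(f"gradient disparity: {gradient_disparity.item():.4f}")
```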
Adaptive first-order methods in optimization are prominent in machine learning and data science owing to their ability to automatically adapt to the landscape of the function being optimized. However, their convergence guarantees are typically stated in te ...
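As a concrete instance of this class, the sketch below shows an AdaGrad-style update, where each coordinate's effective step size shrinks with its accumulated squared gradients; the learning rate, epsilon, and toy objective are illustrative defaults, not the paper's setting.

```python
import numpy as np

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One AdaGrad update: per-coordinate step sizes adapt to the landscape
    via the running sum of squared gradients."""
    accum = accum + grad ** 2
    w = w - lr * grad / (np.sqrt(accum) + eps)
    return w, accum

# Minimize f(w) = ||w||^2 / 2 from a random start (toy objective).
w = np.random.randn(5)
accum = np.zeros_like(w)
for _ in range(100):
    w, accum = adagrad_step(w, grad=w, accum=accum)  # grad of f is w itself
```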
The way our brain learns to disentangle complex signals into unambiguous concepts is fascinating but remains largely unknown. There is evidence, however, that hierarchical neural representations play a key role in the cortex. This thesis investigates biolo ...
As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-para ...
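One popular compression primitive in this family is top-k sparsification, sketched below; it is a generic example, not necessarily the scheme studied in the cited work. Each worker transmits only the k largest-magnitude gradient entries together with their indices.

```python
import torch

def topk_compress(grad: torch.Tensor, k: int):
    """Keep the k largest-magnitude entries of a gradient tensor;
    only (values, indices) need to be communicated."""
    flat = grad.reshape(-1)
    _, idx = flat.abs().topk(k)
    return flat[idx], idx

def topk_decompress(values: torch.Tensor, idx: torch.Tensor, numel: int):
    """Reconstruct a dense (sparse-filled) gradient on the receiver side."""
    out = torch.zeros(numel, dtype=values.dtype)
    out[idx] = values
    return out

g = torch.randn(10_000)
vals, idx = topk_compress(g, k=100)            # ~1% of the entries are sent
g_hat = topk_decompress(vals, idx, g.numel())  # server-side reconstruction
```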
In this dissertation, we propose gradient-based methods for characterizing model behaviour for the purposes of knowledge transfer and post-hoc model interpretation. Broadly, gradients capture the variation of some output feature of the model upon unit vari ...
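The simplest instance of this idea in post-hoc interpretation is a gradient saliency map: the gradient of an output score with respect to the input ranks input features by local sensitivity. A generic sketch follows (the model and input are placeholders, not the dissertation's specific method):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
x = torch.randn(1, 10, requires_grad=True)

score = model(x).sum()
score.backward()                   # populates x.grad with d(score)/d(x)
saliency = x.grad.abs().squeeze()  # larger value => output more sensitive to that feature
```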
In this paper, we study an emerging class of neural networks, the Morphological Neural Networks, from some modern perspectives. Our approach utilizes ideas from tropical geometry and mathematical morphology. First, we state the training of a binary morphol ...
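For context, the basic building block in this tropical (max-plus) view is a morphological dilation layer, in which the usual sum-of-products is replaced by a max of sums; a minimal NumPy sketch, with shapes chosen purely for illustration:

```python
import numpy as np

def dilation_layer(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Max-plus analogue of a linear layer:
    y[b, j] = max_i (x[b, i] + W[i, j])."""
    return (x[:, :, None] + W[None, :, :]).max(axis=1)

x = np.random.randn(4, 8)   # batch of 4 inputs with 8 features
W = np.random.randn(8, 3)   # "weights" act additively inside the max
y = dilation_layer(x, W)    # shape (4, 3)
```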
Wearable solutions based on Deep Learning (DL) for real-time ECG monitoring are a promising alternative to detect life-threatening arrhythmias. However, DL models suffer from a large memory footprint, which hampers their adoption in portable technologies. Th ...
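A standard way to shrink such a memory footprint (a generic technique, not necessarily the one used in the cited work) is post-training weight quantization; the sketch below stores weights as int8 with a single float scale, roughly a 4x saving over float32:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Uniform symmetric post-training quantization: store weights as int8
    plus one float scale."""
    scale = max(np.abs(w).max(), 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 64).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(w - dequantize(q, scale)).max()  # bounded by scale / 2
```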
This paper considers the Byzantine fault-tolerance problem in distributed stochastic gradient descent (D-SGD) method - a popular algorithm for distributed multi-agent machine learning. In this problem, each agent samples data points independently from a ce ...
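A common robust-aggregation baseline in this Byzantine setting is the coordinate-wise median, which the server can apply in place of plain averaging; a minimal sketch with simulated worker gradients (the honest/Byzantine values are invented for illustration):

```python
import numpy as np

def robust_aggregate(worker_grads):
    """Coordinate-wise median across workers: tolerates a minority of
    Byzantine workers sending arbitrary gradient vectors."""
    return np.median(np.stack(worker_grads), axis=0)

rng = np.random.default_rng(0)
honest = [rng.normal(1.0, 0.1, size=5) for _ in range(4)]  # near the true gradient
byzantine = [np.full(5, 1e6)]                              # adversarial outlier
print(robust_aggregate(honest + byzantine))                # stays close to 1.0
```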