**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Publication# Perspective on training fully connected networks with resistive memories: Device requirements for multiple conductances of varying significance

Résumé

Novel Deep Neural Network (DNN) accelerators based on crossbar arrays of non-volatile memories (NVMs)-such as Phase-Change Memory or Resistive Memory-can implement multiply-accumulate operations in a highly parallelized fashion. In such systems, computation occurs in the analog domain at the location of weight data encoded into the conductances of the NVM devices. This allows DNN training of fully-connected layers to be performed faster and with less energy. Using a mixed-hardware-software experiment, we recently showed that by encoding each weight into four distinct physical devices-a "Most Significant Conductance" pair (MSP) and a "Least Significant Conductance" pair (LSP)-we can train DNNs to software-equivalent accuracy despite the imperfections of real analog memory devices. We surmised that, by dividing the task of updating and maintaining weight values between the two conductance pairs, this approach should significantly relax the otherwise quite stringent device requirements. In this paper, we quantify these relaxed requirements for analog memory devices exhibiting a saturating conductance response, assuming either an immediate or a delayed steep initial slope in conductance change. We discuss requirements on the LSP imposed by the "Open Loop Tuning" performed after each training example and on the MSP due to the "Closed Loop Tuning" performed periodically for weight transfer between the conductance pairs. Using simulations to evaluate the final generalization accuracy of a trained four-neuronlayer fully-connected network, we quantify the required dynamic range (as controlled by the size of the steep initial jump), the tolerable device-to-device variability in both maximum conductance and maximum conductance change, the tolerable pulse-to-pulse variability in conductance change, and the tolerable device yield, for both the LSP and MSP devices. We also investigate various Closed Loop Tuning strategies and describe the impact of the MSP/LSP approach on device endurance. Published by AIP Publishing.

Official source

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés

Chargement

Publications associées

Chargement

Concepts associés (14)

Apprentissage profond

L'apprentissage profond ou apprentissage en profondeur (en anglais : deep learning, deep structured learning, hierarchical learning) est un sous-domaine de l’intelligence artificiel

Poids

Le poids est la force de la pesanteur, d'origine gravitationnelle et inertielle, exercée, par exemple, par la Terre sur un corps massique en raison uniquement du voisinage de la Terre. Son unité da

Electrical resistance and conductance

The electrical resistance of an object is a measure of its opposition to the flow of electric current. Its reciprocal quantity is , measuring the ease with which an electric current passes. Electrica

Publications associées (121)

Chargement

Chargement

Chargement

In this thesis, we propose new algorithms to solve inverse problems in the context of biomedical images. Due to ill-posedness, solving these problems require some prior knowledge of the statistics of the underlying images. The traditional algorithms, in the field, assume prior knowledge related to smoothness or sparsity of these images. Recently, they have been outperformed by the second generation algorithms which harness the power of neural networks to learn required statistics from training data. Even more recently, last generation deep-learning-based methods have emerged which require neither training nor training data. This thesis devises algorithms which progress through these generations. It extends these generations to novel formulations and applications while bringing more robustness. In parallel, it also progresses in terms of complexity, from proposing algorithms for problems with 1D data and an exact known forward model to the ones with 4D data and an unknown parametric forward model. We introduce five main contributions. The last three of them propose deep-learning-based latest-generation algorithms that require no prior training. 1) We develop algorithms to solve the continuous-domain formulation of inverse problems with both classical Tikhonov and total-variation regularizations. We formalize the problems, characterize the solution set, and devise numerical approaches to find the solutions. 2) We propose an algorithm that improves upon end-to-end neural-network-based second generation algorithms. In our method, a neural network is first trained as a projector on a training set, and is then plugged in as a projector inside the projected gradient descent (PGD). Since the problem is nonconvex, we relax the PGD to ensure convergence to a local minimum under some constraints. This method outperforms all the previous generation algorithms for Computed Tomography (CT). 3) We develop a novel time-dependent deep-image-prior algorithm for modalities that involve a temporal sequence of images. We parameterize them as the output of an untrained neural network fed with a sequence of latent variables. To impose temporal directionality, the latent variables are assumed to lie on a 1D manifold. The network is then tuned to minimize the data fidelity. We obtain state-of-the-art results in dynamic magnetic resonance imaging (MRI) and even recover intra-frame images. 4) We propose a novel reconstruction paradigm for cryo-electron-microscopy (CryoEM) called CryoGAN. Motivated by generative adversarial networks (GANs), we reconstruct a biomolecule's 3D structure such that its CryoEM measurements resemble the acquired data in a distributional sense. The algorithm is pose-or-likelihood-estimation-free, needs no ab initio, and is proven to have a theoretical guarantee of recovery of the true structure. 5) We extend CryoGAN to reconstruct continuously varying conformations of a structure from heterogeneous data. We parameterize the conformations as the output of a neural network fed with latent variables on a low-dimensional manifold. The method is shown to recover continuous protein conformations and their energy landscape.

We report on the use of deep learning algorithms to perform depth recovery in multiview imaging. We show that if enough training data are provided, a neural network such as multilayer perceptron can be trained to recover the depth in multiview imaging as a regression problem. Such a method can replace camera calibration since no knowledge on the camera configuration is required during training. Another advantage of deep learning for this problem, is the speed of testing; typically a few microseconds per point in the scene. This is a lot better than state-of-art algorithms that require to solve a full optimization problem. In a second part, we have studied a related problem: detecting changes in the camera setting. We have shown that deep learning classifiers can recognize amongst a few (4 or 5) camera settings based only on the projections of points on the cameras, with less than 1% classification error. This is a promising step towards the SLAM problem.

2016Martina Bodini, Giorgio Cristiano, Massimo Giordano

Hardware accelerators based on two-terminal non-volatile memories (NVMs) can potentially provide competitive speed and accuracy for the training of fully connected deep neural networks (FC-DNNs), with respect to GPUs and other digital accelerators. We recently proposed [S. Ambrogio et al., Nature, 2018] novel neuromorphic crossbar arrays, consisting of a pair of phase-change memory (PCM) devices combined with a pair of 3-Transistor 1-Capacitor (3T1C) circuit elements, so that each weight was implemented using multiple conductances of varying significance, and then showed that this weight element can train FC-DNNs to software-equivalent accuracies. Unfortunately, however, real arrays of emerging NVMs such as PCM typically include some failed devices (e.g.,

2019