Publication

Multi-memristive synaptic architectures for training neural networks

Irem Boybat Kara
2020
EPFL thesis
Abstract

Highly data-centric AI workloads require new computing paradigms because the performance of traditional CPU- and GPU-based systems is limited by data access and transfer. Training deep neural networks with millions of tunable parameters takes days or even weeks on relatively powerful heterogeneous systems and consumes hundreds of kilowatts of power. In-memory computing with memristive devices is a very promising avenue to accelerate deep learning because computations take place within the memory itself, eliminating the need to move data back and forth. The synaptic weights can be represented by the analog conductance states of memristive devices organized in crossbar arrays. The computationally expensive operations associated with deep-learning training can be performed in place by exploiting the physical attributes and state dynamics of memristive devices together with circuit laws. Memristive cores are also of particular interest owing to the non-volatility, scalability, CMOS compatibility, and fast access times of the constituent devices. In addition, the multi-level storage capability of certain memristive technologies is especially attractive for increasing the information storage capacity of such cores.
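To make the in-place computation concrete, the sketch below simulates an idealized crossbar performing a matrix-vector multiplication: weights are mapped onto device conductances, input activations onto word-line voltages, and each output current is the sum of Ohm's-law contributions along a bit line (Kirchhoff's current law). All names and constants here are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

# Minimal, idealized sketch of an analog crossbar matrix-vector multiply.
# Weights map onto device conductances, inputs onto word-line voltages,
# and each bit-line current sums Ohm's-law contributions (Kirchhoff's law).

rng = np.random.default_rng(0)

weights = rng.standard_normal((4, 3))          # 4 inputs x 3 outputs

# Map signed weights onto a differential pair of non-negative conductances
# (G+ and G-), a common encoding for memristive crossbars.
g_max = 1e-5                                   # assumed max conductance (S)
scale = g_max / np.abs(weights).max()
g_pos = np.clip(weights, 0, None) * scale
g_neg = np.clip(-weights, 0, None) * scale

voltages = rng.uniform(-0.2, 0.2, size=4)      # input activations as voltages (V)

# Ohm's law per device, Kirchhoff's current law per column:
i_out = voltages @ g_pos - voltages @ g_neg    # column currents (A)

# The same result computed digitally, for comparison:
reference = (voltages @ weights) * scale
assert np.allclose(i_out, reference)
```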

Large-scale demonstrations that combine memristive synapses with digital or analog CMOS circuitry indicate the potential of in-memory computing to accelerate deep learning. However, these implementations are highly vulnerable to non-ideal memristive device behavior. In particular, the limited weight-representation capability, intra-device and array-level variability, and temporal variations of the conductance states pose significant challenges to achieving training accuracies comparable to those of conventional von Neumann implementations. Design solutions that address these non-idealities without introducing significant implementation complexity will be critical for future memristive systems.

This thesis proposes a novel synaptic architecture that can overcome many of the aforementioned device non-idealities. In particular, it investigates the use of multiple memristive devices as a single computational primitive to represent a neural network weight and examines experimentally how such a compute primitive can mitigate undesired memristive behavior. We propose a novel technique to arbitrate between the constituent devices of the synapse that can easily be implemented in hardware and adds only minimal energy overhead. We explore the proposed concept across various networks, including conventional (non-spiking) and spiking neural networks. The efficacy of this synaptic architecture is demonstrated for different training approaches, including fully memristive and mixed-precision in-memory training, through experiments using more than 1 million phase-change memory devices. Furthermore, we show that the proposed concept can be a key enabler for exploiting binary memristive devices in deep-learning training.
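To illustrate the general idea (not the exact experimental setup of the thesis), here is a minimal sketch of a multi-memristive synapse: several devices jointly represent one weight as the sum of their conductances, and a simple counter arbitrates which device receives each programming pulse. The device count, noise level, and update granularity below are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch of a multi-memristive synapse: N devices jointly
# represent one weight (their summed conductance), and a counter
# arbitrates which device is programmed on each update. The noise and
# granularity constants are assumptions, not measured PCM values.

rng = np.random.default_rng(1)

class MultiMemristiveSynapse:
    def __init__(self, n_devices=4, g_max=1.0):
        self.g = np.zeros(n_devices)       # conductances of constituent devices
        self.g_max = g_max
        self.counter = 0                   # arbitration counter (cheap in hardware)

    def weight(self):
        # The weight is represented by the sum of all device conductances.
        return self.g.sum()

    def update(self, delta):
        # Only one device is selected per update; the counter then advances.
        i = self.counter % len(self.g)
        noisy = delta * (1 + 0.3 * rng.standard_normal())   # device variability
        self.g[i] = np.clip(self.g[i] + noisy, 0.0, self.g_max)
        self.counter += 1

syn = MultiMemristiveSynapse()
for _ in range(20):
    syn.update(+0.05)                      # a train of potentiation pulses
print(f"effective weight after 20 updates: {syn.weight():.3f}")
```

Because updates rotate across the constituent devices, the effective weight gains dynamic range and update granularity roughly proportional to the number of devices, while device-to-device variability tends to average out.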

Related concepts (33)
Recurrent neural network
A recurrent neural network (RNN) is an artificial neural network with recurrent connections. It consists of interconnected units (neurons) that interact non-linearly and whose structure contains at least one cycle. The units are connected by arcs (synapses) that carry weights. The output of a neuron is a nonlinear combination of its inputs.
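As a minimal illustration (with arbitrary sizes and random weights), the recurrence can be written as a single update rule applied along a sequence, where the hidden state feeds back into the network:

```python
import numpy as np

# Minimal sketch of one recurrent update: the hidden state h feeds back
# into the network (the "cycle"), and each output is a nonlinear
# combination of weighted inputs. Sizes and weights are arbitrary.

rng = np.random.default_rng(2)
n_in, n_hidden = 3, 5
W_x = rng.standard_normal((n_hidden, n_in)) * 0.1      # input-to-hidden weights
W_h = rng.standard_normal((n_hidden, n_hidden)) * 0.1  # recurrent weights
b = np.zeros(n_hidden)

h = np.zeros(n_hidden)                        # initial hidden state
for x in rng.standard_normal((10, n_in)):     # a sequence of 10 inputs
    h = np.tanh(W_x @ x + W_h @ h + b)        # recurrent, nonlinear update
print(h)
```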
Deep learning
Deep learning (also called deep structured learning or hierarchical learning) is a subfield of artificial intelligence that uses neural networks to solve complex tasks through architectures composed of multiple nonlinear transformations. These techniques have enabled rapid and significant progress in the analysis of audio and visual signals, notably facial recognition, speech recognition, computer vision, and automated language processing.
Memristor
In electronics, the memristor is a passive component. It has been described as the fourth elementary passive circuit element, alongside the capacitor, the resistor, and the inductor. The name is a portmanteau of the words memory and resistor. A memristor stores information efficiently because the value of its electrical resistance changes permanently when a current is applied.
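A toy model in the spirit of the linear ion-drift description (with made-up constants) shows this defining behavior: the internal state, and hence the resistance, integrates the current that has flowed through the device.

```python
import numpy as np

# Toy linear-drift memristor model with made-up constants: the internal
# state w integrates the applied current, so the resistance "remembers"
# the charge that has flowed through the device.

R_on, R_off = 100.0, 16e3          # assumed limiting resistances (ohm)
mu = 2e4                           # assumed drift coefficient (1/C), arbitrary

w = 0.1                            # internal state in [0, 1]
dt = 1e-3
for t in np.arange(0, 1, dt):
    v = 0.5 * np.sin(2 * np.pi * t)       # sinusoidal drive (V)
    R = w * R_on + (1 - w) * R_off        # state-dependent resistance
    i = v / R                             # Ohm's law
    w = np.clip(w + mu * i * dt, 0, 1)    # state follows charge flow
print(f"final state w = {w:.3f}, resistance = {w * R_on + (1 - w) * R_off:.0f} ohm")
```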
Related publications (67)

Exploring High-Performance and Energy-Efficient Architectures for Edge AI-Enabled Applications

Joshua Alexander Harrison Klein

The desire and ability to place AI-enabled applications on the edge has grown significantly in recent years. However, the compute-, area-, and power-constrained nature of edge devices are stressed by the needs of the AI-enabled applications, due to a gener ...
EPFL, 2024

Supervised learning and inference of spiking neural networks with temporal coding

Ana Stanojevic

The way biological brains carry out advanced yet extremely energy efficient signal processing remains both fascinating and unintelligible. It is known however that at least some areas of the brain perform fast and low-cost processing relying only on a smal ...
EPFL, 2023

2D Nanosystems: Applications of 2D Semiconductors for In-Memory Computing

Guilherme Migliato Marega

Machine learning and data processing algorithms have been thriving in finding ways of processing and classifying information by exploiting the hidden trends of large datasets. Although these emerging computational methods have become successful in today's ...
EPFL, 2023
Related MOOCs (23)
Neuronal Dynamics 2 - Computational Neuroscience: Neuronal Dynamics of Cognition
This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.