Publication

# Biologically plausible deep learning – but how far can we go with shallow networks?

Abstract

Training deep neural networks with the error backpropagation algorithm is considered implausible from a biological perspective. Numerous recent publications suggest elaborate models for biologically plausible variants of deep learning, typically defining success as reaching around 98% test accuracy on the MNIST data set. Here, we investigate how far we can go on digit (MNIST) and object (CIFAR10) classification with biologically plausible, local learning rules in a network with one hidden layer and a single readout layer. The hidden layer weights are either fixed (random or random Gabor filters) or trained with unsupervised methods (Principal/Independent Component Analysis or Sparse Coding) that can be implemented by local learning rules. The readout layer is trained with a supervised, local learning rule. We first implement these models with rate neurons. This comparison reveals, first, that unsupervised learning does not lead to better performance than fixed random projections or Gabor filters for large hidden layers. Second, networks with localized receptive fields perform significantly better than networks with all-to-all connectivity and can reach backpropagation performance on MNIST. We then implement two of the networks - fixed, localized, random & random Gabor filters in the hidden layer - with spiking leaky integrate-and-fire neurons and spike timing dependent plasticity to train the readout layer. These spiking models achieve > 98.2% test accuracy on MNIST, which is close to the performance of rate networks with one hidden layer trained with backpropagation. The performance of our shallow network models is comparable to most current biologically plausible models of deep learning. Furthermore, our results with a shallow spiking network provide an important reference and suggest the use of datasets other than MNIST for testing the performance of future models of biologically plausible deep learning.
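The rate-based architecture described above (a fixed random hidden layer followed by a readout trained with a local, supervised rule) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the ReLU nonlinearity, the delta rule as the local readout rule, the synthetic Gaussian-cluster data standing in for MNIST, and all names and hyperparameters (`train_random_feature_classifier`, `n_hidden`, `lr`, and so on) are assumptions chosen for brevity.

```python
import numpy as np

def train_random_feature_classifier(X, y, n_hidden=200, n_classes=3,
                                    lr=0.05, epochs=20, seed=0):
    """Fixed random hidden layer plus a readout trained with the delta rule."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # Hidden weights are drawn once and never updated (assumed Gaussian init).
    W_hid = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_hidden))
    b_hid = rng.normal(0.0, 0.1, size=n_hidden)
    W_out = np.zeros((n_hidden, n_classes))
    targets = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            h = np.maximum(0.0, X[i] @ W_hid + b_hid)  # hidden rates (ReLU)
            err = targets[i] - h @ W_out               # readout error
            # Local update: presynaptic rate times postsynaptic error.
            W_out += lr * np.outer(h, err) / n_hidden
    return W_hid, b_hid, W_out

def predict(X, W_hid, b_hid, W_out):
    h = np.maximum(0.0, X @ W_hid + b_hid)
    return np.argmax(h @ W_out, axis=1)

def make_toy_data(n_per_class=100, n_in=20, n_classes=3, seed=1):
    """Well-separated Gaussian clusters, a hypothetical stand-in for MNIST."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 3.0, size=(n_classes, n_in))
    X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_class, n_in))
                   for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    params = train_random_feature_classifier(X, y)
    print("training accuracy:", np.mean(predict(X, *params) == y))
```

Only `W_out` changes during training, and each update uses quantities available at the synapse (the hidden rate and the output error), which is what makes the rule local in this sketch. Localized receptive fields, as studied in the paper, would additionally restrict each column of `W_hid` to a patch of the input.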

Related concepts (26)

Spike-timing-dependent plasticity

Spike-timing-dependent plasticity (STDP) is a process that modifies the weights of synapses. The modification depends on the timing of action potential firing in the neurons…

Artificial neural network

An artificial neural network, or artificial neuronal network, is a system whose design was originally loosely inspired by the functioning of biological neurons…

Neural network

A neural network can refer to a neural circuit of biological neurons (sometimes also called a biological neural network), a network of artificial neurons or nodes in the case of an artificial neur…

Related publications (144)

Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same functional unit, the neuron, and develop through the same learning mechanism, synaptic plasticity. This motivates the conjecture of a unifying theory to explain cortical representational learning across sensory modalities. In this thesis we present theories and computational models of learning and optimization in neural networks, postulating functional properties of synaptic plasticity that support the apparent universal learning capacity of cortical networks.

In the past decades, a variety of theories and models have been proposed to describe receptive field formation in sensory areas. They include normative models such as sparse coding, and bottom-up models such as spike-timing dependent plasticity. We bring together candidate explanations by demonstrating that in fact a single principle is sufficient to explain receptive field development. First, we show that many representative models of sensory development are in fact implementing variations of a common principle: nonlinear Hebbian learning. Second, we reveal that nonlinear Hebbian learning is sufficient for receptive field formation through sensory inputs. A surprising result is that our findings are independent of specific details, and allow for robust predictions of the learned receptive fields. Thus, nonlinear Hebbian learning and natural statistics can account for many aspects of receptive field formation across models and sensory modalities.

The Hebbian learning theory substantiates that synaptic plasticity can be interpreted as an optimization procedure, implementing stochastic gradient descent. In stochastic gradient descent inputs arrive sequentially, as in sensory streams. However, individual data samples have very little information about the correct learning signal, and it becomes a fundamental problem to know how many samples are required for reliable synaptic changes. Through estimation theory, we develop a novel adaptive learning rate model that adapts the magnitude of synaptic changes based on the statistics of the learning signal, enabling an optimal use of data samples. Our model has a simple implementation and demonstrates improved learning speed, making this a promising candidate for large artificial neural network applications. The model also makes predictions on how cortical plasticity may modulate synaptic plasticity for optimal learning.

The optimal sampling size for reliable learning allows us to estimate optimal learning times for a given model. We apply this theory to derive analytical bounds on times for the optimization of synaptic connections. First, we show this optimization problem to have exponentially many saddle-nodes, which lead to small gradients and slow learning. Second, we show that the number of input synapses to a neuron modulates the magnitude of the initial gradient, determining the duration of learning. Our final result reveals that the learning duration increases supra-linearly with the number of synapses, suggesting an effective limit on synaptic connections and receptive field sizes in developing neural networks.
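The nonlinear Hebbian principle named in this abstract can be illustrated with a short sketch. It assumes a generic update of the form Δw ∝ x·f(w·x) followed by weight normalization, with f(u) = u³ as one possible nonlinearity; the thesis does not specify these choices here, and the function names, mini-batch averaging, and synthetic data (one heavy-tailed input direction among Gaussian ones, already whitened) are all hypothetical.

```python
import numpy as np

def nonlinear_hebbian(X, f=lambda u: u ** 3, lr=0.02, epochs=20,
                      batch=200, seed=0):
    """Hebbian update dw ~ x * f(w . x), with w renormalized to unit length."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.ones(d) / np.sqrt(d)  # start with equal weight on every input
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch):
            xb = X[order[start:start + batch]]
            u = xb @ w  # postsynaptic activation for each sample
            # Presynaptic input times a nonlinear function of the output,
            # averaged over the mini-batch.
            w += lr * (xb * f(u)[:, None]).mean(axis=0)
            w /= np.linalg.norm(w)  # keep the weight vector bounded
    return w

def make_sources(n=20000, d=5, seed=1):
    """Unit-variance inputs: dimension 0 is Laplace-distributed, the rest Gaussian."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    X[:, 0] = rng.laplace(scale=1.0 / np.sqrt(2.0), size=n)
    return X

if __name__ == "__main__":
    w = nonlinear_hebbian(make_sources())
    print("learned weights:", np.round(w, 3))
```

With this choice of f, the rule performs stochastic gradient ascent on the fourth moment of the output under a norm constraint, so the weight vector is drawn toward the heavy-tailed input direction. Other nonlinearities would emphasize other statistics of the input.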

Wulfram Gerstner, Samuel Pavio Muscinelli, Tilo Schwalger

While most models of randomly connected neural networks assume single-neuron models with simple dynamics, neurons in the brain exhibit complex intrinsic dynamics over multiple timescales. We analyze how the dynamical properties of single neurons and recurrent connections interact to shape the effective dynamics in large randomly connected networks. A novel dynamical mean-field theory for strongly connected networks of multi-dimensional rate neurons shows that the power spectrum of the network activity in the chaotic phase emerges from a nonlinear sharpening of the frequency response function of single neurons. For the case of two-dimensional rate neurons with strong adaptation, we find that the network exhibits a state of resonant chaos, characterized by robust, narrow-band stochastic oscillations. The coherence of stochastic oscillations is maximal at the onset of chaos and their correlation time scales with the adaptation timescale of single units. Surprisingly, the resonance frequency can be predicted from the properties of isolated neurons, even in the presence of heterogeneity in the adaptation parameters. In the presence of these internally-generated chaotic fluctuations, the transmission of weak, low-frequency signals is strongly enhanced by adaptation, whereas signal transmission is not influenced by adaptation in the non-chaotic regime. Our theoretical framework can be applied to other mechanisms at the level of single neurons, such as synaptic filtering, refractoriness or spike synchronization. These results advance our understanding of the interaction between the dynamics of single units and recurrent connectivity, which is a fundamental step toward the description of biologically realistic neural networks.

Author summary

Biological neural networks are formed by a large number of neurons whose interactions can be extremely complex. Such systems have been successfully studied using random network models, in which the interactions among neurons are assumed to be random. However, the dynamics of single units are usually described using over-simplified models, which might not capture several salient features of real neurons. Here, we show how accounting for richer single-neuron dynamics results in shaping the network dynamics and determines which signals are better transmitted. We focus on adaptation, an important mechanism present in biological neurons that consists of a decrease in their firing rate in response to a sustained stimulus. Our mean-field approach reveals that the presence of adaptation shifts the network into a previously unreported dynamical regime that we term resonant chaos, in which chaotic activity has a strong oscillatory component. Moreover, we show that this regime is advantageous for the transmission of low-frequency signals. Our work bridges the microscopic dynamics (single neurons) to the macroscopic dynamics (network), and shows how the global signal-transmission properties of the network can be controlled by acting on the single-neuron dynamics. These results pave the way for further developments that include more complex neural mechanisms, and considerably advance our understanding of realistic neural networks.
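A randomly connected network of two-dimensional rate units with adaptation can be simulated along the following lines. The abstract does not give the equations, so the specific form used here (a linear adaptation variable that subtracts from the unit's input, a tanh transfer function, Gaussian couplings with gain `g`, and forward-Euler integration) is an assumption, as are all parameter names and values. Whether a given parameter setting produces the resonant chaos described above is not something this sketch establishes.

```python
import numpy as np

def simulate_adaptive_rate_network(n=200, g=2.0, beta=1.0, tau_a=10.0,
                                   dt=0.05, steps=2000, seed=0):
    """Euler simulation of rate units x_i, each with an adaptation variable a_i.

    Assumed dynamics (illustrative only):
        dx/dt = -x - a + J @ tanh(x)
        da/dt = (-a + beta * x) / tau_a
    """
    rng = np.random.default_rng(seed)
    # Random couplings with variance g^2 / n, as in standard random-network models.
    J = rng.normal(0.0, g / np.sqrt(n), size=(n, n))
    x = rng.normal(0.0, 1.0, size=n)
    a = np.zeros(n)
    trace = np.empty((steps, n))
    for t in range(steps):
        dx = -x - a + J @ np.tanh(x)
        da = (-a + beta * x) / tau_a
        x = x + dt * dx
        a = a + dt * da
        trace[t] = x
    return trace

if __name__ == "__main__":
    trace = simulate_adaptive_rate_network()
    # Population-averaged power spectrum of the unit activities.
    spectrum = np.mean(np.abs(np.fft.rfft(trace, axis=0)) ** 2, axis=1)
    print("spectrum shape:", spectrum.shape)
```

Setting `beta=0` removes the adaptation feedback and recovers a one-dimensional rate model, which gives a simple baseline for comparing power spectra.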

2019

The way our brain learns to disentangle complex signals into unambiguous concepts is fascinating but remains largely unknown. There is evidence, however, that hierarchical neural representations play a key role in the cortex. This thesis investigates biologically plausible models of unsupervised learning of hierarchical representations as found in the brain and modern computer vision models. We use computational modeling to address three main questions at the intersection of artificial intelligence (AI) and computational neuroscience.

The first question is: What are useful neural representations and when are deep hierarchical representations needed? We approach this point with a systematic study of biologically plausible unsupervised feature learning in shallow 2-layer networks on digit (MNIST) and object (CIFAR10) classification. Surprisingly, random features support high performance, especially for large hidden layers. When combined with localized receptive fields, random feature networks approach the performance of supervised backpropagation on MNIST, but not on CIFAR10. We suggest that future models of biologically plausible learning should outperform such random feature benchmarks on MNIST, or that such models should be evaluated in different ways.

The second question is: How can hierarchical representations be learned with mechanisms supported by neuroscientific evidence? We cover this question by proposing a unifying Hebbian model, inspired by common models of V1 simple and complex cells based on unsupervised sparse coding and temporal invariance learning. In shallow 2-layer networks, our model reproduces learning of simple and complex cell receptive fields, as found in V1. In deeper networks, we stack multiple layers of Hebbian learning but find that it does not yield hierarchical representations of increasing usefulness. From this, we hypothesise that standard Hebbian rules are too constrained to build increasingly useful representations, as observed in higher areas of the visual cortex or deep artificial neural networks.

The third question is: Can AI inspire learning models that build deep representations and are still biologically plausible? We address this question by proposing a learning rule that takes inspiration from neuroscience and recent advances in self-supervised deep learning. The proposed rule is Hebbian, i.e. it depends only on pre- and post-synaptic neuronal activity, but includes additional local factors, namely predictive dendritic input and widely broadcasted modulation factors. Algorithmically, this rule applies self-supervised contrastive predictive learning to a causal, biological setting using saccades. We find that networks trained with this generalised Hebbian rule build deep hierarchical representations of images, speech and video.

We see our modeling as a potential starting point for both new hypotheses that can be tested experimentally and novel AI models that could benefit from added biological realism.