Publication

Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail

Abstract

Changes of synaptic connections between neurons are thought to be the physiological basis of learning. These changes can be gated by neuromodulators that encode the presence of reward. We study a family of reward-modulated synaptic learning rules for spiking neurons on a learning task in continuous space inspired by the Morris water maze. The synaptic update rule modifies the release probability of synaptic transmission and depends on the timing of presynaptic spike arrival, postsynaptic action potentials, as well as the membrane potential of the postsynaptic neuron. The family of learning rules includes an optimal rule derived from policy gradient methods as well as reward-modulated Hebbian learning. The synaptic update rule is implemented in a population of spiking neurons using a network architecture that combines feedforward input with lateral connections. Actions are represented by a population of hypothetical action cells with strong Mexican-hat connectivity and are read out at theta frequency. We show that in this architecture, a standard policy gradient rule fails to solve the Morris water maze task, whereas a variant with a Hebbian bias can learn the task within 20 trials, consistent with experiments. This result does not depend on implementation details such as the size of the neuronal populations. Our theoretical approach shows how learning new behaviors can be linked to reward-modulated plasticity at the level of single synapses and makes predictions about the voltage and spike-timing dependence of synaptic plasticity and the influence of neuromodulators such as dopamine. It is an important step towards connecting formal theories of reinforcement learning with neuronal and synaptic properties.
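
The family of rules described in the abstract, ranging from a policy gradient rule to reward-modulated Hebbian learning, can be illustrated with a much simplified toy model. The sketch below is not the paper's spiking model: it assumes a single stochastic neuron with a sigmoid spike probability in discrete time, and the names `hebbian_bias`, `learning_rate`, and all function names are hypothetical choices made for illustration. With `hebbian_bias=0` the update is the standard policy gradient (REINFORCE) form for a Bernoulli unit; with `hebbian_bias=1` it reduces to a reward-modulated product of presynaptic input and postsynaptic spike.

```python
import math


def spike_probability(weights, inputs):
    """Probability that the toy neuron spikes (sigmoid of the summed drive).

    Assumption: a Bernoulli unit stands in for the spiking neuron model.
    """
    drive = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-drive))


def eligibility(inputs, spiked, p_spike, hebbian_bias):
    """Per-synapse eligibility for one time step.

    hebbian_bias = 0: (spiked - p_spike) * input, the log-likelihood gradient
                      of a Bernoulli unit (policy gradient form).
    hebbian_bias = 1: spiked * input, a purely Hebbian pre/post product.
    Intermediate values interpolate between the two (illustrative assumption).
    """
    post_term = spiked - (1.0 - hebbian_bias) * p_spike
    return [post_term * x for x in inputs]


def reward_modulated_update(weights, inputs, spiked, reward,
                            hebbian_bias=0.0, learning_rate=0.1):
    """Return new weights after one reward-gated update."""
    p_spike = spike_probability(weights, inputs)
    trace = eligibility(inputs, spiked, p_spike, hebbian_bias)
    return [w + learning_rate * reward * e for w, e in zip(weights, trace)]


# Usage: one active input, the neuron spiked, and a reward was delivered.
w0 = [0.0, 0.0]
x = [1.0, 0.0]
policy_gradient_w = reward_modulated_update(w0, x, spiked=1, reward=1.0,
                                            hebbian_bias=0.0)
hebbian_w = reward_modulated_update(w0, x, spiked=1, reward=1.0,
                                    hebbian_bias=1.0)
```

In this toy setting the Hebbian variant changes the active synapse more strongly than the policy gradient variant for the same rewarded spike, because the expected-activity term is no longer subtracted. Whether this captures the mechanism behind the paper's result is not something the sketch can establish.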

Related concepts (33)
Chemical synapse
Chemical synapses are biological junctions through which neurons' signals can be sent to each other and to non-neuronal cells such as those in muscles or glands. Chemical synapses allow neurons to form circuits within the central nervous system. They are crucial to the biological computations that underlie perception and thought. They allow the nervous system to connect to and control other systems of the body. At a chemical synapse, one neuron releases neurotransmitter molecules into a small space (the synaptic cleft) that is adjacent to another neuron.
Action potential
An action potential occurs when the membrane potential of a specific cell rapidly rises and falls. This depolarization then causes adjacent locations to similarly depolarize. Action potentials occur in several types of animal cells, called excitable cells, which include neurons, muscle cells, and in some plant cells. Certain endocrine cells such as pancreatic beta cells, and certain cells of the anterior pituitary gland are also excitable cells.
Synaptic plasticity
In neuroscience, synaptic plasticity is the ability of synapses to strengthen or weaken over time, in response to increases or decreases in their activity. Since memories are postulated to be represented by vastly interconnected neural circuits in the brain, synaptic plasticity is one of the important neurochemical foundations of learning and memory (see Hebbian theory). Plastic change often results from the alteration of the number of neurotransmitter receptors located on a synapse.
Related publications (139)

Long-term plasticity induces sparse and specific synaptic changes in a biophysically detailed cortical model

Eilif Benjamin Muller, Michael Reimann, James Gonzalo King, Marwan Muhammad Ahmed Abdellah, Pramod Shivaji Kumbhar, András Ecker, Sirio Bolaños Puchet, James Bryden Isbister, Daniela Egas Santander, Jorge Blanco Alonso, Giuseppe Chindemi, Ioannis Magkanaris

Synaptic plasticity underlies the brain’s ability to learn and adapt. This process is often studied in small groups of neurons in vitro or indirectly through its effects on behavior in vivo. Due to the limitations of available experimental techniques, inve ...
2023

Principles of Network Plasticity in Neocortical Microcircuits

András Ecker

Synaptic plasticity underlies our ability to learn and adapt to the constantly changing environment. The phenomenon of synapses changing their efficacy in an activity-dependent manner is often studied in small groups of neurons in vitro or indirectly throu ...
EPFL, 2023

Striatal Dopamine Signals and Reward Learning

Carl Petersen, Sylvain Crochet, Yanqi Liu, Parviz Ghaderi, Mauro Pulin, Anthony Pierre Robert Renard, Christos Sourmpis, Pol Bech Vilaseca, Meriam Malekzadeh, Robin François Virginien Dard

We are constantly bombarded by sensory information and constantly making decisions on how to act. In order to optimally adapt behavior, we must judge which sequences of sensory inputs and actions lead to successful outcomes in specific circumstances. Neuro ...
Oxford, 2023
Related MOOCs (25)
Simulation Neuroscience
Learn how to digitally reconstruct a single neuron to better study the biological mechanisms of brain function, behaviour and disease.