Adding prediction risk to the theory of reward learning
Everybody knows what it feels like to be surprised. Surprise raises our attention and is crucial for learning. It is a ubiquitous concept whose traces have been found in both neuroscience and machine learning. However, a comprehensive theory has not yet been de ...
How do animals learn to repeat behaviors that lead to obtaining food or other “rewarding” objects? As a biologically plausible paradigm for learning in spiking neural networks, spike-timing-dependent plasticity (STDP) has been shown to perform well ...
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On the one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic pla ...
Imaging systems can be designed using examples and methods similar to the techniques used in deep learning. We describe experimental results demonstrating optical tomography based on the learning approach. ...
Is it possible to teach workers while crowdsourcing classification tasks? Amongst the challenges: (a) workers have different (unknown) skills, competence, and learning rate to which the teaching must be adapted, (b) feedback on the workers’ progress is lim ...
We revisit a recently developed iterative learning algorithm that enables systems to learn from a repeated operation with the goal of achieving high tracking performance of a given trajectory. The learning scheme is based on a coarse dynamics model of the ...
One difficulty with the Swiss dual system is the gap between the practical work in the company and the theoretical teaching at school. In this article, we examine the case of carpenters. We observe that the school-workplace gap exists and materializes thro ...
Recent experiments have shown that spike-timing-dependent plasticity is influenced by neuromodulation. We derive theoretical conditions for successful learning of reward-related behavior for a large class of learning rules where Hebbian synaptic plasticity ...
Reinforcement learning algorithms have been successfully applied in robotics to learn how to solve tasks based on reward signals obtained during task execution. These reward signals are usually modeled by the programmer or provided by supervision. However, ...
In this paper we describe a new computational model of switching between path-planning and cue-guided navigation strategies. It is based on three main assumptions: (i) the strategies are mediated by separate memory systems that learn independently and in p ...