Concept

Temporal difference learning

Related publications (32)

One-shot learning and eligibility traces in sequential decision making

Marco Philipp Lehmann

When humans or animals perform an action that led to a desired outcome, they show a tendency to repeat it. The mechanisms underlying learning from past experience and adapting future behavior are still not fully understood. In this thesis, I study how huma ...

EPFL, 2018

Time course of prediction errors in sequential decision making

Michael Herzog, He Xu

In reinforcement learning, an agent makes sequential decisions to maximize reward. During learning, the actual and expected outcome are compared to tell whether a decision was good or bad. The difference between the actual outcome and expected outcome is t ...

2018

Model-based reinforcement learning and navigation in animals and machines

Dane Sterling Corneil

For decades, neuroscientists and psychologists have observed that animal performance on spatial navigation tasks suggests an internal learned map of the environment. More recently, map-based (or model-based) reinforcement learning has become a highly activ ...

EPFL, 2018

Neural Correlates of Sequential Decision Making

Michael Herzog, He Xu

In chess, a series of moves is made until a delayed sparse feedback (win, loss) is issued, which makes it impossible to evaluate the value of a single move. There are powerful reinforcement learning (RL) algorithms, which can cope with these sequential dec ...

2017

Evidence for eligibility traces in human learning

Michael Herzog, Wulfram Gerstner, Kerstin Preuschoff, Marco Philipp Lehmann, He Xu, Vasiliki Liakoni

Whether we prepare a coffee or navigate to a shop: in many tasks we make multiple decisions before reaching a goal. Learning such state-action sequences from sparse reward raises the problem of credit-assignment: which actions out of a long sequence should ...

arXiv, 2017

Dynamic Bayesian Networks for Student Modeling

Intelligent tutoring systems adapt the curriculum to the needs of the individual student. Therefore, an accurate representation and prediction of student knowledge is essential. Bayesian Knowledge Tracing (BKT) is a popular approach for student modeling. T ...

2017

Prospective Coding by Spiking Neurons

Johanni Michael Brea, Walter Senn

Animals learn to make predictions, such as associating the sound of a bell with upcoming feeding or predicting a movement that a motor command is eliciting. How predictions are realized on the neuronal level and what plasticity rule underlies their learnin ...

Public Library of Science, 2016

Learning to Track: Online Multi-object Tracking by Decision Making

Alexandre Massoud Alahi

Online Multi-Object Tracking (MOT) has wide applications in time-critical video analysis scenarios, such as robot navigation and autonomous driving. In tracking-by-detection, a major challenge of online MOT is how to robustly associate noisy object detecti ...

IEEE, 2015

Baseline frontostriatal-limbic connectivity predicts reward-based memory formation

Friedhelm Christoph Hummel

Reward mediates the acquisition and long-term retention of procedural skills in humans. Yet, learning under rewarded conditions is highly variable across individuals and the mechanisms that determine interindividual variability in rewarded learning are not ...

Wiley-Blackwell, 2014