In chess, a series of moves is made before delayed, sparse feedback (win or loss) is issued, which makes it difficult to evaluate the value of any single move. Powerful reinforcement learning (RL) algorithms can cope with such sequential decision-making situations. A crucial component in these algorithms is the reward prediction error (RPE), which measures the difference between the actual reward and the predicted reward. Here, we show that the RPE is well reflected in a frontal negativity of the EEG. Participants saw an image on a computer screen and were asked to click one of, for example, four buttons; depending on the choice, a new image was presented, until a goal state was reached. 128-channel EEG was recorded. To estimate the RPE, we fit participants' performance to state-of-the-art RL algorithms and used the best-fitting algorithm to estimate the RPE. Two time windows (170–286 ms and 400–580 ms) in the event-related potential (ERP) reflected the magnitudes of the RPEs of this algorithm well. The late time window ERP magnitude was highly correlated with the RPEs (r2=0.14, p
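The abstract does not specify which RL algorithms were fit, so as a minimal illustration of the reward prediction error itself, the sketch below shows a tabular temporal-difference (TD) update in which the RPE is r + γV(s') − V(s). All names and parameter values (states, alpha, gamma, the toy episode) are illustrative assumptions, not the paper's actual model.

```python
# Minimal sketch of a tabular TD learner, assuming the classic definition
# RPE = r + gamma * V(s') - V(s). States, alpha, gamma, and the toy episode
# below are illustrative only.
from collections import defaultdict

def td_update(V, s, r, s_next, terminal, alpha=0.1, gamma=0.95):
    """Update the value table V in place and return the reward prediction error."""
    target = r if terminal else r + gamma * V[s_next]
    rpe = target - V[s]      # reward prediction error (delta)
    V[s] += alpha * rpe      # learn from the prediction error
    return rpe

# Example: sparse, delayed feedback -- reward arrives only at the goal state.
V = defaultdict(float)
episode = [("s0", 0.0, "s1", False),
           ("s1", 0.0, "s2", False),
           ("s2", 1.0, "goal", True)]
for s, r, s_next, terminal in episode:
    print(s, "RPE =", td_update(V, s, r, s_next, terminal))
```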
Michael Herzog, Simona Adele Garobbio, Maya Roinishvili, Ophélie Gladys Favrod
Michael Herzog, Maya Roinishvili, Ophélie Gladys Favrod, Patricia Figueiredo, Albulena Jashari-Shaqiri