Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
To achieve an optimal outcome in many situations, agents need to choose distinct actions from one another. This is the case notably in many resource allocation problems, where a single resource can only be used by one agent at a time. How shall a designer ...
This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed f ...
In this paper we describe a new computational model of switching between path-planning and cue-guided navigation strategies. It is based on three main assumptions: (i) the strategies are mediated by separate memory systems that learn independently and in p ...
Acute stress regulates different aspects of behavioral learning through the action of stress hormones and neuromodulators. Stress effects depend on stressor's type, intensity, timing, and the learning paradigm. In addition, genetic background of animals mi ...
How do animals learn to repeat behaviors that lead to the obtention of food or other “rewarding” objects? As a biologically plausible paradigm for learning in spiking neural networks, spike-timing dependent plasticity (STDP) has been shown to perform well ...
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic pla ...
We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is a relation among those tasks, then the information gained during execution of one task has value for the execution of another task. Cons ...
Many games have undesirable Nash equilibria. For exam- ple consider a resource allocation game in which two players compete for an exclusive access to a single resource. It has three Nash equilibria. The two pure-strategy NE are effi- cient, but not fair. ...
One difficulty with the Swiss dual system is the gap between the practical work in the company and the theoretical teaching at school. In this article, we examine the case of carpenters. We observe that the school-workplace gap exists and materializes thro ...
An agent that deviates from a usual or previous course of action can be said to display novel or varying behaviour. Novelty of behaviour can be seen as the result of real or apparent randomness in decision making, which prevents an agent from repeating exa ...