Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
In the Bayesian approach to sequential decision making, exact calculation of the (subjective) utility is intractable. This extends to most special cases of interest, such as reinforcement learning problems. While utility bounds are known to exist for this ...
How do animals learn to repeat behaviors that lead to the obtention of food or other “rewarding” objects? As a biologically plausible paradigm for learning in spiking neural networks, spike-timing dependent plasticity (STDP) has been shown to perform well ...
Perceptual learning improves perception through training. Perceptual learning improves with most stimulus types but fails when certain stimulus types are mixed during training (roving). This result is surprising because classical supervised and unsupervise ...
Direct transfer of human motion trajectories to humanoid robots does not result in dynamically stable robot movements due to the differences in human and humanoid robot kinematics and dynamics. We developed a system that converts human movements captured b ...
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic pla ...
Perceptual learning improves with most basic stimuli. Interestingly, performance does not improve when stimuli of two types are randomly presented during training (roving). For example, there is no perceptual learning when left or right bisection stimuli w ...
Mechanisms on automatic discovery of macro actions or skills in reinforcement learning methods are mainly focused on subgoal discovery methods. Among the proposed algorithms, those based on graph centrality measures demonstrate a high performance gain. In ...
In this paper we describe a new computational model of switching between path-planning and cue-guided navigation strategies. It is based on three main assumptions: (i) the strategies are mediated by separate memory systems that learn independently and in p ...
This paper considers the issues of efficiency and autonomy that are required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy with the ...
Reward mediates the acquisition and long-term retention of procedural skills in humans. Yet, learning under rewarded conditions is highly variable across individuals and the mechanisms that determine interindividual variability in rewarded learning are not ...