Preference elicitation and inverse reinforcement learning
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents are trying to sol ...
Based on the 'universal' values of economic development, democratic governance and cultural diversity promoted by UNESCO, the official policy of the Federation of Malaysia, known as Wawasan 2020 (Vision 2020), promotes modernization with an emphasis on dem ...
Milestones in sparse signal reconstruction and compressive sensing can be understood in a probabilistic Bayesian context, fusing underdetermined measurements with knowledge about low level signal properties in the posterior distribution, which is maximized ...
Institute of Electrical and Electronics Engineers2010
We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each one may represent one expert trying to solve a different task, or as different experts trying to solve the same task. Our main contribution is ...
In this paper, we derive a probabilistic registration algorithm for object modeling and tracking. In many robotics applications, such as manipulation tasks, nonvisual information about the movement of the object is available, which we will combine with the ...
In the Bayesian approach to sequential decision making, exact calculation of the (subjective) utility is intractable. This extends to most special cases of interest, such as reinforcement learning problems. While utility bounds are known to exist for this ...
This paper introduces a simple, general framework for likelihood-free Bayesian reinforcement learning, through Approximate Bayesian Computation (ABC). The main advantage is that we only require a prior distribution on a class of simulators (generative mode ...
Purpose – The purpose of this paper is to focus on the distinction between smart specialisation and smart specialisation policy and it studies under what conditions a smart specialisation policy is necessary. Design/methodology/approach – A conceptual fram ...
This paper considers the task of finding a target location by making a limited number of sequential observations. Each observation results from evaluating an imperfect classifier of a chosen cost and accuracy on an interval of chosen length and position. W ...
Markov decision processes (MDPs) are powerful tools for decision making in uncertain dynamic environments. However, the solutions of MDPs are of limited practical use because of their sensitivity to distributional model parameters, which are typically unkn ...