Robust Bayesian reinforcement learning through tight lower bounds
Related publications (62)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Machine learning is often cited as a new paradigm in control theory, but is also often viewed as empirical and less intuitive for students than classical model-based methods. This is particularly the case for reinforcement learning, an approach that does n ...
In this master thesis, multi-agent reinforcement learning is used to teach robots to build a self-supporting structure connecting two points. To accomplish this task, a physics simulator is first designed using linear programming. Then, the task of buildin ...
Finding optimal bidding strategies for generation units in electricity markets would result in higher profit. However, it is a challenging problem due to the system uncertainty which is due to the lack of knowledge of the strategies of other generation uni ...
This letter, addressed to a creature taking the form of a human chimera gathering the thoughts and knowledge of people who inspire and accompany us, recounts the experiences, affects and issues related to our first semester of teaching the course named DRA ...
We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this an ...
Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language proc ...
Cities are increasingly reusing industrial heritage as part of cultural and creative regeneration strategies. However, designers and decision-makers face the challenge of determining which features and elements of industrial heritage are more perceived and ...
This article investigates the optimal containment control problem for a class of heterogeneous multi-agent systems with time-varying actuator faults and unmatched disturbances based on adaptive dynamic programming. Since there exist unknown input signals i ...
Learning-based outlier (mismatched correspondence) rejection for robust 3D registration generally formulates the outlier removal as an inlier/outlier classification problem. The core for this to be successful is to learn the discriminative inlier/outlier f ...
Model-free Reinforcement Learning (RL) generally suffers from poor sample complexity, mostly due to the need to exhaustively explore the state-action space to find well-performing policies. On the other hand, we postulate that expert knowledge of the syste ...