Distributed Value-Function Learning with Linear Convergence Rates
Public policy evaluation has traditionally been conducted in a context constrained by legal, institutional and political red tape. The arrival of digitalisation adds a new layer to public policy evaluation processes. Assessing the quality of technologies, ...
Machine learning is often cited as a new paradigm in control theory, but is also often viewed as empirical and less intuitive for students than classical model-based methods. This is particularly the case for reinforcement learning, an approach that does n ...
This paper investigates the effect of combination policies on the performance of adaptive social learning in non-stationary environments. By analyzing the relation between the error probability and the underlying graph topology, we prove that in the slow a ...
Over the past decades, the world-leading Green Certification Protocols have paid increasing attention to health-related aspects of buildings. However, the way and the extent to which green certifications currently account for these aspects vary largel ...
This article develops a fully decentralized multiagent algorithm for policy evaluation. The proposed scheme can be applied to two distinct scenarios. In the first scenario, a collection of agents have distinct datasets gathered by following different behav ...
Policymaking is a complex process that has been studied using policy process theories almost exclusively. These theories have been built using a large number of qualitative cases. Such methods are useful for theory building but remain limited for theory ex ...
We consider a learning system based on the conventional multiplicative weight (MW) rule that combines experts' advice to predict a sequence of true outcomes. It is assumed that one of the experts is malicious and aims to impose the maximum loss on the sys ...
Deep Reinforcement Learning (DRL) recently emerged as a possibility to control complex systems without the need to model them. However, since weeks-long experiments are needed to assess the performance of a building controller, people still have to rely on ...
We address the challenge of learning factored policies in cooperative MARL scenarios. In particular, we consider the situation in which a team of agents collaborates to optimize a common cost. The goal is to obtain factored policies that determine the indi ...
Learning how to act and adapting to unexpected changes are remarkable capabilities of humans and other animals. In the absence of a direct recipe to follow in life, behaviour is often guided by rewarding and by surprising events. A positive or a negative o ...