Distributed Value-Function Learning with Linear Convergence Rates
Machine learning is often cited as a new paradigm in control theory, but is also often viewed as empirical and less intuitive for students than classical model-based methods. This is particularly the case for reinforcement learning, an approach that does n ...
Public policy evaluation has traditionally been conducted in a context constrained by legal, institutional and political red tape. The arrival of digitalisation adds a new layer to public policy evaluation processes. Assessing the quality of technologies, ...
This article develops a fully decentralized multiagent algorithm for policy evaluation. The proposed scheme can be applied to two distinct scenarios. In the first scenario, a collection of agents have distinct datasets gathered by following different behav ...
Policymaking is a complex process that has been studied using policy process theories almost exclusively. These theories have been built using a large number of qualitative cases. Such methods are useful for theory building but remain limited for theory ex ...
We address the challenge of learning factored policies in cooperative MARL scenarios. In particular, we consider the situation in which a team of agents collaborates to optimize a common cost. The goal is to obtain factored policies that determine the indi ...
Learning how to act and adapting to unexpected changes are remarkable capabilities of humans and other animals. In the absence of a direct recipe to follow in life, behaviour is often guided by rewarding and by surprising events. A positive or a negative o ...
This paper investigates the effect of combination policies on the performance of adaptive social learning in non-stationary environments. By analyzing the relation between the error probability and the underlying graph topology, we prove that in the slow a ...
Over the past decades, the world-leading Green Certification Protocols have paid increasing attention to health-related aspects of buildings. However, the way and the extent to which green certifications currently account for these aspects vary largel ...
We consider a learning system based on the conventional multiplicative weight (MW) rule that combines experts' advice to predict a sequence of true outcomes. It is assumed that one of the experts is malicious and aims to impose the maximum loss on the sys ...
Deep Reinforcement Learning (DRL) recently emerged as a possibility to control complex systems without the need to model them. However, since weeks-long experiments are needed to assess the performance of a building controller, people still have to rely on ...