No-Regret Learning from Partially Observed Data in Repeated Auctions
In this paper we propose an unbiased Monte Carlo maximum likelihood estimator for discretely observed Wright-Fisher diffusions. Our approach is based on exact simulation techniques that are of special interest for diffusion processes defined on a bounded d ...
A multi-agent system consists of a collection of decision-making or learning agents subjected to streaming observations from some real-world phenomenon. The goal of the system is to solve some global learning or optimization problem in a distributed or dec ...
Mechanism design theory examines the design of allocation mechanisms or incentive systems involving multiple rational but self-interested agents and plays a central role in many societally important problems in economics. In mechanism design problems, agen ...
Fueled by recent advances in deep neural networks, reinforcement learning (RL) has been in the limelight because of many recent breakthroughs in artificial intelligence, including defeating humans in games (e.g., chess, Go, StarCraft), self-driving cars, s ...
Finding optimal bidding strategies for generation units in electricity markets would result in higher profit. However, this is a challenging problem because of system uncertainty, which stems from the lack of knowledge of the strategies of other generation uni ...
In this master's thesis, multi-agent reinforcement learning is used to teach robots to build a self-supporting structure connecting two points. To accomplish this task, a physics simulator is first designed using linear programming. Then, the task of buildin ...
We study the problem of drift estimation for two-scale continuous time series. We set ourselves in the framework of overdamped Langevin equations, for which a single-scale surrogate homogenized equation exists. In this setting, estimating the drift coeffic ...
Many decision problems in science, engineering, and economics are affected by uncertainty, which is typically modeled by a random variable governed by an unknown probability distribution. For many practical applications, the probability distribution is onl ...
We examine the problem of regret minimization when the learner is involved in a continuous game with other optimizing agents: in this case, if all players follow a no-regret algorithm, it is possible to achieve significantly lower regret relative to fully ...