Multi-agent reinforcement learning (MARL) is a sub-field of reinforcement learning. It focuses on studying the behavior of multiple learning agents that coexist in a shared environment. Each agent is motivated by its own rewards, and does actions to advance its own interests; in some environments these interests are opposed to the interests of other agents, resulting in complex group dynamics.
Multi-agent reinforcement learning is closely related to game theory and especially repeated games, as well as multi-agent systems. Its study combines the pursuit of finding ideal algorithms that maximize rewards with a more sociological set of concepts. While research in single-agent reinforcement learning is concerned with finding the algorithm that gets the biggest number of points for one agent, research in multi-agent reinforcement learning evaluates and quantifies social metrics, such as cooperation, reciprocity, equity, social influence, language and discrimination.
Similarly to single-agent reinforcement learning, multi-agent reinforcement learning is modeled as some form of a Markov decision process (MDP). For example,
A set of environment states.
One set of actions for each of the agents .
is the probability of transition (at time ) from state to state under joint action .
is the immediate joint reward after transition from to with joint action .
In settings with perfect information, such as the games of chess and Go, the MDP would be fully observable. In settings with imperfect information, especially in real-world applications like self-driving cars, each agent would access an observation that only has part of the information about the current state. In the partially observable setting, the core model is the partially observable stochastic game in the general case, and the decentralized POMDP in the cooperative case.
When multiple agents are acting in a shared environment their interests might be aligned or misaligned.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Software agents are widely used to control physical, economic and financial processes. The course presents practical methods for implementing software agents and multi-agent systems, supported by prog
Since 2010 approaches in deep learning have revolutionized fields as diverse as computer vision, machine learning, or artificial intelligence. This course gives a systematic introduction into influent
This course provides the students with 1) a set of theoretical concepts to understand the machine learning approach; and 2) a subset of the tools to use this approach for problems arising in mechanica
AlphaGo is a computer program that plays the board game Go. It was developed by the London-based DeepMind Technologies, an acquired subsidiary of Google (now Alphabet Inc.). Subsequent versions of AlphaGo became increasingly powerful, including a version that competed under the name Master. After retiring from competitive play, AlphaGo Master was succeeded by an even more powerful version known as AlphaGo Zero, which was completely self-taught without learning from human games.
DeepMind Technologies Limited, doing business as Google DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014, becoming a wholly owned subsidiary of Google parent company Alphabet Inc. after Google's corporate restructuring in 2015. The company is based in London, with research centres in Canada, France, and the United States.
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.
At the same time, several different tutorials on available data and data tools, such as those from the Allen Institute for Brain Science, provide you with in-depth knowledge on brain atlases, gene exp
The MOOC on Neuro-robotics focuses on teaching advanced learners to design and construct a virtual robot and test its performance in a simulation using the HBP robotics platform. Learners will learn t
The MOOC on Neuro-robotics focuses on teaching advanced learners to design and construct a virtual robot and test its performance in a simulation using the HBP robotics platform. Learners will learn t
Explores coordination and learning in distributed multiagent systems, covering social laws, task exchange, constraint satisfaction, and coordination algorithms.
The ability to reason, plan and solve highly abstract problems is a hallmark of human intelligence. Recent advancements in artificial intelligence, propelled by deep neural networks, have revolutionized disciplines like computer vision and natural language ...
Buildings play a pivotal role in the ongoing worldwide energy transition, accounting for 30% of the global energy consumption. With traditional engineering solutions reaching their limits to tackle such large-scale problems, data-driven methods and Machine ...
The combination of several interesting characteristics makes metal-organic frameworks (MOFs) a highly sought-after class of nanomaterials for a broad range of applications like gas storage and separation, catalysis, drug delivery, and so on. However, the e ...