Multi-agent reinforcement learning

Multi-agent reinforcement learning (MARL) is a sub-field of reinforcement learning that studies the behavior of multiple learning agents coexisting in a shared environment. Each agent is motivated by its own rewards and takes actions to advance its own interests; in some environments these interests are opposed to those of other agents, resulting in complex group dynamics. Multi-agent reinforcement learning is closely related to game theory, especially repeated games, and to multi-agent systems. Its study combines the search for algorithms that maximize rewards with a more sociological set of concepts. While research in single-agent reinforcement learning is concerned with finding the algorithm that earns the highest reward for one agent, research in multi-agent reinforcement learning evaluates and quantifies social metrics, such as cooperation, reciprocity, equity, social influence, language and discrimination.

Like single-agent reinforcement learning, multi-agent reinforcement learning is modeled as some form of a Markov decision process (MDP). For example, such a model can be specified by a set $S$ of environment states; one set of actions $\mathcal{A}_i$ for each of the agents $i \in I = \{1, \dots, N\}$; the probability $P_a(s, s') = \Pr(s_{t+1} = s' \mid s_t = s, a_t = a)$ of transition (at time $t$) from state $s$ to state $s'$ under joint action $a$; and the immediate joint reward $R_a(s, s')$ after the transition from $s$ to $s'$ with joint action $a$.

In settings with perfect information, such as the games of chess and Go, the MDP is fully observable. In settings with imperfect information, especially in real-world applications like self-driving cars, each agent receives an observation that contains only part of the information about the current state. In the partially observable setting, the core model is the partially observable stochastic game in the general case, and the decentralized POMDP in the cooperative case. When multiple agents act in a shared environment, their interests may be aligned or misaligned.
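As a rough illustration of the model described above (environment states, one action set per agent, a transition probability conditioned on the joint action, and an immediate reward), the following Python sketch encodes a tiny two-agent game as lookup tables. Everything in it is an assumption made for illustration: the names `MarkovGame`, `make_toy_game` and `step`, the states "start" and "goal", the actions "left" and "right", the probabilities, and the choice to store the reward as one number per agent.

```python
import random
from dataclasses import dataclass
from typing import Dict, Tuple

State = str
JointAction = Tuple[str, ...]  # one action per agent, in agent order


@dataclass
class MarkovGame:
    """Tabular multi-agent MDP (hypothetical container, not a library API)."""
    states: Tuple[State, ...]
    action_sets: Tuple[Tuple[str, ...], ...]  # action set A_i for each agent i
    # transition[(s, a)] maps each next state s' to P_a(s, s')
    transition: Dict[Tuple[State, JointAction], Dict[State, float]]
    # reward[(s, a, s')] holds one immediate reward per agent (an assumption)
    reward: Dict[Tuple[State, JointAction, State], Tuple[float, ...]]

    def step(self, state: State, joint_action: JointAction, rng: random.Random):
        """Sample s' from P_a(s, .) and return (s', per-agent rewards)."""
        dist = self.transition[(state, joint_action)]
        next_states = list(dist)
        weights = [dist[s] for s in next_states]
        next_state = rng.choices(next_states, weights=weights)[0]
        return next_state, self.reward[(state, joint_action, next_state)]


def make_toy_game() -> MarkovGame:
    """Two agents reach "goal" only when they pick the same action."""
    actions = ("left", "right")
    transition, reward = {}, {}
    for a0 in actions:
        for a1 in actions:
            joint = (a0, a1)
            if a0 == a1:  # coordinated joint action usually succeeds
                transition[("start", joint)] = {"goal": 0.9, "start": 0.1}
            else:  # miscoordination keeps both agents at the start
                transition[("start", joint)] = {"start": 1.0}
            transition[("goal", joint)] = {"start": 1.0}  # episode restarts
            for s in ("start", "goal"):
                for s_next in transition[(s, joint)]:
                    r = 1.0 if (s == "start" and s_next == "goal") else 0.0
                    reward[(s, joint, s_next)] = (r, r)  # aligned interests
    return MarkovGame(("start", "goal"), (actions, actions), transition, reward)


if __name__ == "__main__":
    game = make_toy_game()
    rng = random.Random(0)
    state = "start"
    for _ in range(5):
        # Each agent picks independently; the environment sees the joint action.
        joint = tuple(rng.choice(a_set) for a_set in game.action_sets)
        state, rewards = game.step(state, joint, rng)
        print(joint, state, rewards)
```

Both agents receive the same reward here, so their interests are fully aligned; giving the agents different or opposite entries in the reward table would turn the same structure into a mixed or competitive game. A partially observable variant would additionally hand each agent only a function of the state rather than the state itself.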

Fusing Pre-existing Knowledge and Machine Learning for Enhanced Building Thermal Modeling and Control

Inverse design of metal-organic frameworks for direct air capture of CO2 via deep reinforcement learning

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning
