Publication: Graph Exploration for Effective Multiagent Q-Learning

Abstract

This article proposes an exploration technique for multiagent reinforcement learning (MARL) with graph-based communication among agents. We assume that the individual rewards received by the agents are independent of the actions of the other agents, while their policies are coupled. In the proposed framework, neighboring agents collaborate to estimate the uncertainty about the state-action space so that they can explore more efficiently. Unlike existing works, the proposed algorithm does not require counting mechanisms and can be applied to continuous-state environments without complex conversion techniques. Moreover, the proposed scheme allows agents to communicate in a fully decentralized manner with minimal information exchange: for continuous-state scenarios, each agent needs to exchange only a single parameter vector. The performance of the algorithm is verified with theoretical results for discrete-state scenarios and with experiments for continuous-state ones.

Related concepts (32)

Related publications (32)

Graph coloring

In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.
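The vertex-coloring constraint described above can be illustrated with a short Python sketch. It assumes a graph stored as an adjacency dictionary; the function names `is_proper_coloring` and `greedy_coloring` are hypothetical, and the greedy strategy shown is only one simple way to obtain a valid coloring (it does not, in general, use the fewest possible colors).

```python
from typing import Dict, Hashable, List

# Assumed representation: each vertex maps to the list of its neighbors.
Adjacency = Dict[Hashable, List[Hashable]]


def is_proper_coloring(adj: Adjacency, colors: Dict[Hashable, int]) -> bool:
    """Return True if no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u in adj for v in adj[u])


def greedy_coloring(adj: Adjacency) -> Dict[Hashable, int]:
    """Give each vertex the smallest color unused by its already-colored neighbors."""
    colors: Dict[Hashable, int] = {}
    for u in adj:
        used = {colors[v] for v in adj[u] if v in colors}
        c = 0
        while c in used:
            c += 1
        colors[u] = c
    return colors


# A 4-cycle a-b-c-d-a, which can be properly colored with two colors.
square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
coloring = greedy_coloring(square)
```

On the 4-cycle, the greedy pass alternates two colors around the cycle, so adjacent vertices never share a color.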

Graph theory

In mathematics, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (also called nodes or points) which are connected by edges (also called links or lines). A distinction is made between undirected graphs, where edges link two vertices symmetrically, and directed graphs, where edges link two vertices asymmetrically. Graphs are one of the principal objects of study in discrete mathematics.

Graph (abstract data type)

In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics. A graph data structure consists of a finite (and possibly mutable) set of vertices (also called nodes or points), together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph. These pairs are known as edges (also called links or lines); for a directed graph they are also sometimes called arrows or arcs.
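As a minimal sketch of such an abstract data type, the Python class below stores a mutable vertex set with adjacency sets. The class name `Graph` and its methods are hypothetical choices for illustration, not a reference to any particular library; a `directed` flag decides whether an edge is treated as an ordered or an unordered pair.

```python
from typing import Dict, Hashable, Set


class Graph:
    """Illustrative graph ADT backed by adjacency sets (an assumed design)."""

    def __init__(self, directed: bool = False) -> None:
        self.directed = directed
        self._adj: Dict[Hashable, Set[Hashable]] = {}

    def add_vertex(self, v: Hashable) -> None:
        # Register the vertex if it is not already present.
        self._adj.setdefault(v, set())

    def add_edge(self, u: Hashable, v: Hashable) -> None:
        # Ordered pair (u, v) for a directed graph; unordered pair otherwise.
        self.add_vertex(u)
        self.add_vertex(v)
        self._adj[u].add(v)
        if not self.directed:
            self._adj[v].add(u)

    def has_edge(self, u: Hashable, v: Hashable) -> bool:
        return v in self._adj.get(u, set())

    def neighbors(self, u: Hashable) -> Set[Hashable]:
        # Return a copy so callers cannot mutate internal state.
        return set(self._adj.get(u, set()))


undirected = Graph()
undirected.add_edge(1, 2)

directed = Graph(directed=True)
directed.add_edge(1, 2)
```

In the undirected instance the edge is visible from both endpoints, while in the directed instance only the arc from 1 to 2 exists.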

We examine the connection between two graph parameters, the size of a minimum feedback arc set and the acyclic disconnection. A feedback arc set of a directed graph is a subset of arcs such that after deletion the graph becomes acyclic. The acyclic disconnecti ...
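The feedback arc set definition in this snippet can be checked mechanically. The Python sketch below is an illustration only, not code from the cited publication: it tests acyclicity with Kahn's topological-sort algorithm, and the function names `is_acyclic` and `is_feedback_arc_set` are hypothetical. It verifies whether a given arc subset is a feedback arc set; it does not search for a minimum one.

```python
from collections import deque
from typing import Hashable, Iterable, List, Tuple

Arc = Tuple[Hashable, Hashable]


def is_acyclic(vertices: List[Hashable], arcs: Iterable[Arc]) -> bool:
    """Kahn's algorithm: a digraph is acyclic iff every vertex can be removed in topological order."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(v for v in vertices if indegree[v] == 0)
    removed_count = 0
    while queue:
        u = queue.popleft()
        removed_count += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed_count == len(vertices)


def is_feedback_arc_set(vertices: List[Hashable], arcs: List[Arc], removed: Iterable[Arc]) -> bool:
    """Return True if deleting the arcs in `removed` leaves an acyclic digraph."""
    removed_set = set(removed)
    remaining = [a for a in arcs if a not in removed_set]
    return is_acyclic(vertices, remaining)


# Directed triangle 0 -> 1 -> 2 -> 0: deleting any single arc breaks the only cycle.
triangle_vertices = [0, 1, 2]
triangle_arcs = [(0, 1), (1, 2), (2, 0)]
```

For the directed triangle, any single arc forms a feedback arc set, whereas the empty set does not.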

Graph machine learning offers a powerful framework with natural applications in scientific fields such as chemistry, biology and material sciences. By representing data as a graph, we encode the prior knowledge that the data is composed of a set of entitie ...

Martin Jaggi, Thijs Vogels, Hadrien Hendrikx

In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model: more accurate gradients allow them to use larger learning rates and optimize faster. In the decentralized setting, in which workers commu ...