Publication

Logical Team Q-learning: An approach towards factored policies in cooperative MARL

Related concepts (23)

Software agent

In computer science, a software agent or software AI is a computer program that acts for a user or other program in a relationship of agency, which derives from the Latin agere (to do): an agreement to act on one's behalf. Such "action on behalf of" implies the authority to decide which, if any, action is appropriate. Some agents are colloquially known as bots, from robot. They may be embodied, as when execution is paired with a robot body, or as software such as a chatbot executing on a phone (e.g.

Behavior

Behavior (American English) or behaviour (British English) is the range of actions and mannerisms made by individuals, organisms, systems or artificial entities in some environment. These systems can include other systems or organisms as well as the inanimate physical environment. It is the computed response of the system or organism to various stimuli or inputs, whether internal or external, conscious or subconscious, overt or covert, and voluntary or involuntary.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Evidence-based policy

Evidence-based policy is a concept in public policy that advocates for policy decisions to be grounded on, or influenced by, rigorously established objective evidence. This concept presents a stark contrast to policymaking predicated on ideology, 'common sense,' anecdotes, or personal intuitions. The approach mirrors the effective altruism movement's philosophy within governmental circles. The methodology employed in evidence-based policy often includes comprehensive research methods such as randomized controlled trials (RCT).

Stochastic gradient descent

Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data).
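As a minimal sketch of the idea, the Python snippet below fits a line to noiseless data by minibatch SGD. Each update uses a gradient estimated from a randomly selected subset of the data rather than the full data set. The function name `sgd_linear_fit`, the squared-error objective, and all hyperparameters (learning rate, batch size, epoch count) are illustrative assumptions, not taken from the publication or the text above.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y ~ w*x + b by minibatch SGD on mean squared error (illustrative)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random subsets on every pass
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient estimate computed from the random subset only,
            # standing in for the gradient over the entire data set.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 3x + 1.
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
```

On this toy problem the estimates should approach the generating slope and intercept; with noisy data the iterates would instead fluctuate around the minimizer unless the learning rate is decayed.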

Cost–utility analysis

Cost–utility analysis (CUA) is a form of economic analysis used to guide procurement decisions. The most common and well-known application of this analysis is in pharmacoeconomics, especially health technology assessment (HTA). In health economics, the purpose of CUA is to estimate the ratio between the cost of a health-related intervention and the benefit it produces in terms of the number of years lived in full health by the beneficiaries. Hence it can be considered a special case of cost-effectiveness analysis, and the two terms are often used interchangeably.

Active learning (machine learning)

Active learning is a special case of machine learning in which a learning algorithm can interactively query a user (or some other information source) to label new data points with the desired outputs. In statistics literature, it is sometimes also called optimal experimental design. The information source is also called teacher or oracle. There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, learning algorithms can actively query the user/teacher for labels.
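The querying loop described above can be sketched roughly as follows, assuming a one-dimensional pool and a simple threshold classifier. The learner repeatedly asks the oracle to label the unlabeled point closest to its current decision boundary (a form of uncertainty sampling). The names `oracle` and `active_learn_threshold`, the hidden threshold of 0.615, and the query budget are all hypothetical choices made for illustration.

```python
def oracle(x, true_threshold=0.615):
    """Stand-in for the teacher: returns the true label (hypothetical rule)."""
    return 1 if x >= true_threshold else 0


def active_learn_threshold(pool, oracle, budget, low=0.0, high=1.0):
    """Estimate a decision threshold using at most `budget` label queries."""
    unlabeled = list(pool)
    queries = 0
    while unlabeled and queries < budget:
        boundary = (low + high) / 2
        # Query the point the current model is least certain about:
        # the unlabeled point nearest the estimated boundary.
        x = min(unlabeled, key=lambda p: abs(p - boundary))
        unlabeled.remove(x)
        label = oracle(x)
        queries += 1
        if label == 1:
            high = min(high, x)  # threshold is at or below x
        else:
            low = max(low, x)    # threshold is above x
    return (low + high) / 2, queries


pool = [i / 100 for i in range(101)]  # 101 unlabeled points in [0, 1]
estimate, used = active_learn_threshold(pool, oracle, budget=12)
```

With 101 pool points the learner spends at most 12 labels, whereas labeling the whole pool would cost 101, which is the saving that motivates active learning when labels are expensive.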

Cost–benefit analysis

Cost–benefit analysis (CBA), sometimes also called benefit–cost analysis, is a systematic approach to estimating the strengths and weaknesses of alternatives. It is used to determine options which provide the best approach to achieving benefits while preserving savings in, for example, transactions, activities, and functional business requirements. A CBA may be used to compare completed or potential courses of action, and to estimate or evaluate the value against the cost of a decision, project, or policy.

Organizational behavior

Organizational behavior or organisational behaviour (see spelling differences) is the "study of human behavior in organizational settings, the interface between human behavior and the organization, and the organization itself". Organizational behavioral research can be categorized in at least three ways: individuals in organizations (micro-level), work groups (meso-level), and how organizations behave (macro-level). Chester Barnard recognized that individuals behave differently when acting in their organizational role than when acting separately from the organization.

Stochastic optimization

Stochastic optimization (SO) methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, which involves random objective functions or random constraints. Stochastic optimization methods also include methods with random iterates. Some stochastic optimization methods use random iterates to solve stochastic problems, combining both meanings of stochastic optimization.
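To illustrate a method with random iterates, here is a small sketch of random local search applied to a deterministic objective. Candidates are drawn by perturbing the current best point with Gaussian noise and kept only if they improve the objective. The function `random_search`, the quadratic test objective, and the step size are assumptions chosen for the example; they are one simple instance of the family, not a canonical algorithm.

```python
import random


def random_search(objective, x0, step=0.5, iterations=2000, seed=0):
    """Minimize a 1-D objective with randomly perturbed iterates (illustrative)."""
    rng = random.Random(seed)
    best_x, best_val = x0, objective(x0)
    for _ in range(iterations):
        # Random iterate: Gaussian perturbation of the current best point.
        candidate = best_x + rng.gauss(0.0, step)
        val = objective(candidate)
        if val < best_val:  # accept only improvements
            best_x, best_val = candidate, val
    return best_x, best_val


# Hypothetical objective with its minimum at x = 2.
best_x, best_val = random_search(lambda x: (x - 2.0) ** 2, x0=-5.0)
```

Here the randomness lives entirely in the method. A stochastic problem would instead have a noisy `objective`, and applying the same loop to it would combine both meanings of stochastic optimization mentioned above.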