Reinforcement learningReinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.
Q-learningQ-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.
DecentralizationDecentralization or decentralisation is the process by which the activities of an organization, particularly those regarding planning and decision-making, are distributed or delegated away from a central, authoritative location or group and given to smaller factions within it. Concepts of decentralization have been applied to group dynamics and management science in private businesses and organizations, political science, law and public administration, technology, economics and money.
Correlated equilibriumIn game theory, a correlated equilibrium is a solution concept that is more general than the well known Nash equilibrium. It was first discussed by mathematician Robert Aumann in 1974. The idea is that each player chooses their action according to their private observation of the value of the same public signal. A strategy assigns an action to every possible observation a player can make. If no player would want to deviate from their strategy (assuming the others also don't deviate), the distribution from which the signals are drawn is called a correlated equilibrium.
Agricultural cooperativeAn agricultural cooperative, also known as a farmers' co-op, is a producer cooperative in which farmers pool their resources in certain areas of activity. A broad typology of agricultural cooperatives distinguishes between agricultural service cooperatives, which provide various services to their individually-farming members, and agricultural production cooperatives in which production resources (land, machinery) are pooled and members farm jointly.
Agent-based modelAn agent-based model (ABM) is a computational model for simulating the actions and interactions of autonomous agents (both individual or collective entities such as organizations or groups) in order to understand the behavior of a system and what governs its outcomes. It combines elements of game theory, complex systems, emergence, computational sociology, multi-agent systems, and evolutionary programming. Monte Carlo methods are used to understand the stochasticity of these models.
Pseudo-polynomial timeIn computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the numeric value of the input (the largest integer present in the input)—but not necessarily in the length of the input (the number of bits required to represent it), which is the case for polynomial time algorithms. In general, the numeric value of the input is exponential in the input length, which is why a pseudo-polynomial time algorithm does not necessarily run in polynomial time with respect to the input length.
Polynomial-time reductionIn computational complexity theory, a polynomial-time reduction is a method for solving one problem using another. One shows that if a hypothetical subroutine solving the second problem exists, then the first problem can be solved by transforming or reducing it to inputs for the second problem and calling the subroutine one or more times. If both the time required to transform the first problem to the second, and the number of times the subroutine is called is polynomial, then the first problem is polynomial-time reducible to the second.
Evolutionarily stable strategyAn evolutionarily stable strategy (ESS) is a strategy (or set of strategies) that is impermeable when adopted by a population in adaptation to a specific environment, that is to say it cannot be displaced by an alternative strategy (or set of strategies) which may be novel or initially rare. Introduced by John Maynard Smith and George R. Price in 1972/3, it is an important concept in behavioural ecology, evolutionary psychology, mathematical game theory and economics, with applications in other fields such as anthropology, philosophy and political science.
Principal–agent problemThe principal–agent problem refers to the conflict in interests and priorities that arises when one person or entity (the "agent") takes actions on behalf of another person or entity (the "principal"). The problem worsens when there is a greater discrepancy of interests and information between the principal and agent, as well as when the principal lacks the means to punish the agent. The deviation from the principal's interest by the agent is called "agency costs".