Deep reinforcement learningDeep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.
Q-learningQ-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.
Animal cognitionAnimal cognition encompasses the mental capacities of non-human animals including insect cognition. The study of animal conditioning and learning used in this field was developed from comparative psychology. It has also been strongly influenced by research in ethology, behavioral ecology, and evolutionary psychology; the alternative name cognitive ethology is sometimes used. Many behaviors associated with the term animal intelligence are also subsumed within animal cognition.
LearningLearning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes, and preferences. The ability to learn is possessed by humans, animals, and some machines; there is also evidence for some kind of learning in certain plants. Some learning is immediate, induced by a single event (e.g. being burned by a hot stove), but much skill and knowledge accumulate from repeated experiences. The changes induced by learning often last a lifetime, and it is hard to distinguish learned material that seems to be "lost" from that which cannot be retrieved.
Reinforcement learningReinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.
Inbred strainInbred strains (also called inbred lines, or rarely for animals linear animals) are individuals of a particular species which are nearly identical to each other in genotype due to long inbreeding. A strain is inbred when it has undergone at least 20 generations of brother x sister or offspring x parent mating, at which point at least 98.6% of the loci in an individual of the strain will be homozygous, and each individual can be treated effectively as clones.
Long-term memoryLong-term memory (LTM) is the stage of the Atkinson–Shiffrin memory model in which informative knowledge is held indefinitely. It is defined in contrast to short-term and working memory, which persist for only about 18 to 30 seconds. LTM is commonly labelled as "explicit memory" (declarative), as well as "episodic memory," "semantic memory," "autobiographical memory," and "implicit memory" (procedural memory). The idea of separate memories for short- and long-term storage originated in the 19th century.
Observational learningObservational learning is learning that occurs through observing the behavior of others. It is a form of social learning which takes various forms, based on various processes. In humans, this form of learning seems to not need reinforcement to occur, but instead, requires a social model such as a parent, sibling, friend, or teacher with surroundings. Particularly in childhood, a model is someone of authority or higher status in an environment.
Short-term memoryShort-term memory (or "primary" or "active memory") is the capacity for holding a small amount of information in an active, readily available state for a short interval. For example, short-term memory holds a phone number that has just been recited. The duration of short-term memory (absent rehearsal or active maintenance) is estimated to be on the order of seconds. The commonly cited capacity of 7 items, found in Miller's Law, has been superseded by 4±1 items. In contrast, long-term memory holds information indefinitely.
Learning rateIn machine learning and statistics, the learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. Since it influences to what extent newly acquired information overrides old information, it metaphorically represents the speed at which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain. In setting a learning rate, there is a trade-off between the rate of convergence and overshooting.