Central pattern generatorCentral pattern generators (CPGs) are self-organizing biological neural circuits that produce rhythmic outputs in the absence of rhythmic input. They are the source of the tightly-coupled patterns of neural activity that drive rhythmic and stereotyped motor behaviors like walking, swimming, breathing, or chewing. The ability to function without input from higher brain areas still requires modulatory inputs, and their outputs are not fixed. Flexibility in response to sensory input is a fundamental quality of CPG-driven behavior.
Multi-agent reinforcement learningMulti-agent reinforcement learning (MARL) is a sub-field of reinforcement learning. It focuses on studying the behavior of multiple learning agents that coexist in a shared environment. Each agent is motivated by its own rewards, and does actions to advance its own interests; in some environments these interests are opposed to the interests of other agents, resulting in complex group dynamics. Multi-agent reinforcement learning is closely related to game theory and especially repeated games, as well as multi-agent systems.
Reinforcement learningReinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.
ProprioceptionProprioception (ˌproʊpri.oʊˈsɛpʃən,_-ə- ), also called kinaesthesia (or kinesthesia), is the sense of self-movement, force, and body position. Proprioception is mediated by proprioceptors, mechanosensory neurons located within muscles, tendons, and joints. Most animals possess multiple subtypes of proprioceptors, which detect distinct kinematic parameters, such as joint position, movement, and load. Although all mobile animals possess proprioceptors, the structure of the sensory organs can vary across species.
Deep reinforcement learningDeep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.
Reinforcement learning from human feedbackIn machine learning, reinforcement learning from human feedback (RLHF) or reinforcement learning from human preferences is a technique that trains a "reward model" directly from human feedback and uses the model as a reward function to optimize an agent's policy using reinforcement learning (RL) through an optimization algorithm like Proximal Policy Optimization. The reward model is trained in advance to the policy being optimized to predict if a given output is good (high reward) or bad (low reward).
OscillationOscillation is the repetitive or periodic variation, typically in time, of some measure about a central value (often a point of equilibrium) or between two or more different states. Familiar examples of oscillation include a swinging pendulum and alternating current. Oscillations can be used in physics to approximate complex interactions, such as those between atoms.
Deep learningDeep learning is part of a broader family of machine learning methods, which is based on artificial neural networks with representation learning. The adjective "deep" in deep learning refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.
Q-learningQ-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.
Sense of balanceThe sense of balance or equilibrioception is the perception of balance and spatial orientation. It helps prevent humans and nonhuman animals from falling over when standing or moving. Equilibrioception is the result of a number of sensory systems working together; the eyes (visual system), the inner ears (vestibular system), and the body's sense of where it is in space (proprioception) ideally need to be intact. The vestibular system, the region of the inner ear where three semicircular canals converge, works with the visual system to keep objects in focus when the head is moving.