Q-learning
Q-learning is a model-free reinforcement learning algorithm that learns the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over all successive steps, starting from the current state.
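To make the idea of "learning the value of an action in a state" concrete, here is a minimal sketch of a single tabular Q-learning update in Python. The function name, the dictionary-based table, the state and action labels, and the values chosen for the learning rate `alpha` and discount factor `gamma` are all illustrative assumptions, not part of the summary above.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value."""
    # Value of the best action in the next state; treated as 0.0 when the
    # next state has no recorded actions (assumed here to mean "terminal").
    next_values = q.get(next_state, {})
    best_next = max(next_values.values()) if next_values else 0.0
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical two-state table: q[state][action] -> estimated value.
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
new_value = q_learning_update(q, "s0", "right", reward=1.0, next_state="s1",
                              alpha=0.5, gamma=0.9)
print(new_value)
```

The update never consults transition probabilities: it uses only the observed reward and next state, which is what makes the method model-free.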
Types of artificial neural networks
There are many types of artificial neural networks (ANN). Artificial neural networks are computational models inspired by biological neural networks, and are used to approximate functions that are generally unknown. In particular, they are inspired by the behaviour of neurons and the electrical signals they convey between input (such as from the eyes or nerve endings in the hand), processing, and output from the brain (such as reacting to light, touch, or heat). The way neurons semantically communicate is an area of ongoing research.
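Delta rule
In machine learning, the delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network. It is a special case of the more general backpropagation algorithm. For a neuron $j$ with activation function $g(x)$, the delta rule for neuron $j$'s $i$th weight $w_{ji}$ is given by

$$\Delta w_{ji} = \alpha \, (t_j - y_j) \, g'(h_j) \, x_i,$$

where $\alpha$ is the learning rate, $g'$ is the derivative of $g$, $t_j$ is the target output, $h_j$ is the weighted sum of the neuron's inputs, $y_j$ is the actual output, and $x_i$ is the $i$th input. It holds that $h_j = \sum_i x_i w_{ji}$ and $y_j = g(h_j)$. The delta rule is commonly stated in simplified form for a neuron with a linear activation function as

$$\Delta w_{ji} = \alpha \, (t_j - y_j) \, x_i.$$

While the delta rule is similar to the perceptron's update rule, the derivation is different.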
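For a single neuron with a linear activation, one delta-rule step changes each weight by the learning rate times the output error times the corresponding input. The following is a minimal Python sketch of that simplified form; the function name, the training example, and the learning rate are illustrative assumptions.

```python
def delta_rule_step(weights, inputs, target, alpha=0.1):
    """Return updated weights after one delta-rule step for a linear neuron."""
    # With a linear activation, the output equals the weighted sum of inputs.
    output = sum(w * x for w, x in zip(weights, inputs))
    # Error between the target output and the actual output.
    error = target - output
    # Each weight moves by alpha * error * (its own input).
    return [w + alpha * error * x for w, x in zip(weights, inputs)]


# Hypothetical training example: repeat the step on one (inputs, target) pair.
weights = [0.0, 0.0]
for _ in range(50):
    weights = delta_rule_step(weights, inputs=[1.0, 2.0], target=1.0)

prediction = sum(w * x for w, x in zip(weights, [1.0, 2.0]))
print(weights, prediction)
```

With these particular inputs and learning rate the error halves on every step, so the prediction approaches the target quickly; a learning rate that is too large for the input magnitudes would make the same loop diverge.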
Hyperbolic discounting
In economics, hyperbolic discounting is a time-inconsistent model of delay discounting. It is one of the cornerstones of behavioral economics, and its neural basis is actively studied by neuroeconomics researchers. According to the discounted utility approach, intertemporal choices are no different from other choices, except that some consequences are delayed and hence must be anticipated and discounted (i.e., reweighted to take into account the delay). Given two similar rewards, humans show a preference for one that arrives sooner rather than later.
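The summary above does not give a formula, so the following Python sketch assumes one commonly used hyperbolic form, in which a reward of size A delayed by D is valued at A / (1 + kD). The function name, the discount parameter `k`, and the reward amounts and delays are all illustrative assumptions. The example shows the time inconsistency: the same pair of rewards is ranked differently once both are pushed further into the future.

```python
def hyperbolic_value(amount, delay, k=0.5):
    """Present value of a delayed reward under the assumed form A / (1 + k*D)."""
    return amount / (1.0 + k * delay)


# Choice made today: 50 now versus 100 after 10 time units.
near_small = hyperbolic_value(50, delay=0)
near_large = hyperbolic_value(100, delay=10)

# The same two rewards, both shifted 20 time units into the future.
far_small = hyperbolic_value(50, delay=20)
far_large = hyperbolic_value(100, delay=30)

print("near choice prefers smaller-sooner:", near_small > near_large)
print("far choice prefers larger-later:", far_large > far_small)
```

Under exponential discounting, shifting both rewards by the same delay would never flip the ranking, which is why the reversal here is taken as the signature of the hyperbolic model.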
Interference theory
Interference theory is a theory of human memory, according to which interference occurs in learning. The notion is that memories encoded in long-term memory (LTM) are forgotten and cannot be retrieved into short-term memory (STM) because one memory can interfere with another. There is an immense number of encoded memories within the storage of LTM. The challenge for memory retrieval is recalling the specific memory and working with it in the temporary workspace that STM provides.
Delayed gratification
Delayed gratification, or deferred gratification, is the resistance to the temptation of an immediate pleasure in the hope of obtaining a valuable and long-lasting reward in the long term. In other words, delayed gratification describes the process a subject undergoes when resisting the temptation of an immediate reward in preference for a later reward. Generally, delayed gratification is associated with resisting a smaller but more immediate reward in order to receive a larger or more enduring reward later.
Procrastination
Procrastination is the act of unnecessarily and voluntarily delaying or postponing something despite knowing that there will be negative consequences for doing so. The word originated from the Latin word procrastinatus, which itself evolved from the prefix pro-, meaning "forward," and crastinus, meaning "of tomorrow." Oftentimes, it is a habitual human behaviour. It is a common human experience involving delays in everyday chores or even putting off salient tasks such as attending an appointment, submitting a job report or academic assignment, or broaching a stressful issue with a partner.