Delta rule

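In machine learning, the delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network. It is a special case of the more general backpropagation algorithm. For a neuron $j$ with activation function $g(x)$, the delta rule for neuron $j$'s $i$th weight $w_{ji}$ is given by

$$\Delta w_{ji} = \alpha (t_j - y_j) \, g'(h_j) \, x_i,$$

where $\alpha$ is a small constant called the learning rate, $g'$ is the derivative of $g$, $t_j$ is the target output, $h_j$ is the weighted sum of the neuron's inputs, $y_j$ is the actual output, and $x_i$ is the $i$th input. It holds that $h_j = \sum_i x_i w_{ji}$ and $y_j = g(h_j)$.

The delta rule is commonly stated in simplified form for a neuron with a linear activation function as

$$\Delta w_{ji} = \alpha (t_j - y_j) \, x_i.$$

While the delta rule is similar to the perceptron's update rule, the derivation is different. The perceptron uses the Heaviside step function as the activation function $g(h)$, so $g'(h)$ does not exist at zero and is equal to zero elsewhere, which makes direct application of the delta rule impossible.

The delta rule is derived by attempting to minimize the error in the output of the neural network through gradient descent. The error for a neural network, summed over its output neurons $j$, can be measured as

$$E = \sum_j \frac{1}{2} (t_j - y_j)^2.$$

We wish to move through the "weight space" of the neuron (the space of all possible values of all of the neuron's weights) in proportion to the gradient of the error function with respect to each weight. To do that, we calculate the partial derivative of the error with respect to each weight. For the $i$th weight, this derivative can be written as

$$\frac{\partial E}{\partial w_{ji}}.$$

Because we are only concerned with the $j$th neuron, we can substitute the error formula above while omitting the summation:

$$\frac{\partial E}{\partial w_{ji}} = \frac{\partial}{\partial w_{ji}} \left[ \frac{1}{2} (t_j - y_j)^2 \right].$$

Next we use the chain rule to split this into two derivatives:

$$\frac{\partial E}{\partial w_{ji}} = \frac{\partial \left[ \frac{1}{2} (t_j - y_j)^2 \right]}{\partial y_j} \, \frac{\partial y_j}{\partial w_{ji}}.$$

To find the left derivative, we apply the power rule together with the chain rule:

$$\frac{\partial E}{\partial w_{ji}} = -(t_j - y_j) \, \frac{\partial y_j}{\partial w_{ji}}.$$

To find the right derivative, we again apply the chain rule, this time differentiating with respect to the total input to $j$, $h_j$:

$$\frac{\partial E}{\partial w_{ji}} = -(t_j - y_j) \, \frac{\partial y_j}{\partial h_j} \, \frac{\partial h_j}{\partial w_{ji}}.$$

Note that the output of the $j$th neuron, $y_j$, is just the neuron's activation function $g$ applied to the neuron's input $h_j$. We can therefore write the derivative of $y_j$ with respect to $h_j$ simply as $g$'s first derivative:

$$\frac{\partial E}{\partial w_{ji}} = -(t_j - y_j) \, g'(h_j) \, \frac{\partial h_j}{\partial w_{ji}}.$$

Next we rewrite $h_j$ in the last term as the sum over all $k$ weights of each weight $w_{jk}$ times its corresponding input $x_k$:

$$\frac{\partial E}{\partial w_{ji}} = -(t_j - y_j) \, g'(h_j) \, \frac{\partial}{\partial w_{ji}} \left[ \sum_k x_k w_{jk} \right].$$

Because we are only concerned with the $i$th weight, the only term of the summation that is relevant is $x_i w_{ji}$. Its derivative with respect to $w_{ji}$ is $x_i$, which gives the gradient

$$\frac{\partial E}{\partial w_{ji}} = -(t_j - y_j) \, g'(h_j) \, x_i.$$

Stepping against this gradient with proportionality constant $\alpha$ recovers the delta rule stated above.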
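The derivation can be checked numerically. The following Python sketch is illustrative only: it assumes a logistic sigmoid for the activation function, and the weights, inputs, target, and learning rate are arbitrary made-up values. All function names are hypothetical. It compares the delta rule update with a finite-difference estimate of the error gradient for a single neuron, then applies one update and measures the error before and after.

```python
import math


def sigmoid(h):
    """Logistic activation g(h); an assumed choice, any differentiable g works."""
    return 1.0 / (1.0 + math.exp(-h))


def sigmoid_prime(h):
    """Derivative g'(h) of the logistic activation."""
    s = sigmoid(h)
    return s * (1.0 - s)


def weighted_sum(weights, inputs):
    """Total input h_j = sum_i x_i * w_ji."""
    return sum(w * x for w, x in zip(weights, inputs))


def error(weights, inputs, target):
    """Single-neuron error E = 1/2 * (t_j - y_j)^2."""
    y = sigmoid(weighted_sum(weights, inputs))
    return 0.5 * (target - y) ** 2


def delta_rule_step(weights, inputs, target, alpha):
    """Weight changes: delta_w_ji = alpha * (t_j - y_j) * g'(h_j) * x_i."""
    h = weighted_sum(weights, inputs)
    y = sigmoid(h)
    return [alpha * (target - y) * sigmoid_prime(h) * x for x in inputs]


def numerical_gradient(weights, inputs, target, i, eps=1e-6):
    """Central-difference estimate of dE/dw_ji."""
    up = list(weights)
    down = list(weights)
    up[i] += eps
    down[i] -= eps
    return (error(up, inputs, target) - error(down, inputs, target)) / (2 * eps)


# Illustrative values only (not taken from the text above).
weights = [0.2, -0.4, 0.1]
inputs = [1.0, 0.5, -1.5]
target = 1.0
alpha = 0.5

deltas = delta_rule_step(weights, inputs, target, alpha)
grads = [numerical_gradient(weights, inputs, target, i) for i in range(len(weights))]

# The delta rule should equal -alpha times the gradient, component by component.
max_mismatch = max(abs(d + alpha * grad) for d, grad in zip(deltas, grads))

# Apply one update and compare the error before and after.
error_before = error(weights, inputs, target)
new_weights = [w + d for w, d in zip(weights, deltas)]
error_after = error(new_weights, inputs, target)

print("max mismatch between delta rule and -alpha * gradient:", max_mismatch)
print("error before update:", error_before)
print("error after update: ", error_after)
```

For a linear neuron, replacing `sigmoid` with the identity and `sigmoid_prime` with a function returning 1 yields the simplified form of the rule.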
