The value function of an optimization problem gives the value attained by the objective function at a solution, while depending only on the parameters of the problem. In a controlled dynamical system, the value function represents the optimal payoff of the system over the interval $[t, t_1]$ when started at the time-$t$ state variable $x(t) = x$. If the objective function represents some cost that is to be minimized, the value function can be interpreted as the cost to finish the optimal program, and is thus referred to as the "cost-to-go function". In an economic context, where the objective function usually represents utility, the value function is conceptually equivalent to the indirect utility function.

In a problem of optimal control, the value function is defined as the supremum of the objective function taken over the set of admissible controls. Given $(t_0, x_0) \in [0, t_1] \times \mathbb{R}^d$, a typical optimal control problem is to

$$\text{maximize} \quad J(t_0, x_0; u) = \int_{t_0}^{t_1} I(t, x(t), u(t)) \, dt + \phi(x(t_1))$$

subject to

$$\frac{dx(t)}{dt} = f(t, x(t), u(t))$$

with initial state variable $x(t_0) = x_0$. The objective function $J(t_0, x_0; u)$ is to be maximized over all admissible controls $u \in U[t_0, t_1]$, where $u$ is a Lebesgue measurable function from $[t_0, t_1]$ to some prescribed arbitrary set in $\mathbb{R}^m$. The value function is then defined as

$$V(t, x(t)) = \max_{u \in U} \int_{t}^{t_1} I(\tau, x(\tau), u(\tau)) \, d\tau + \phi(x(t_1))$$

with $V(t_1, x(t_1)) = \phi(x(t_1))$, where $\phi(x(t_1))$ is the "scrap value". If the optimal pair of control and state trajectories is $(x^*, u^*)$, then $V(t_0, x_0) = J(t_0, x_0; u^*)$. The function $h$ that gives the optimal control $u^*$ based on the current state $x$ is called a feedback control policy, or simply a policy function.

Bellman's principle of optimality roughly states that any optimal policy at time $t$, $t_0 \le t \le t_1$, taking the current state $x(t)$ as the "new" initial condition, must be optimal for the remaining problem. If the value function happens to be continuously differentiable, this gives rise to an important partial differential equation known as the Hamilton–Jacobi–Bellman (HJB) equation,

$$-\frac{\partial V(t, x)}{\partial t} = \max_{u} \left\{ I(t, x, u) + \frac{\partial V(t, x)}{\partial x} f(t, x, u) \right\}$$

where the maximand on the right-hand side can also be rewritten as the Hamiltonian, $H(t, x, u, \lambda) = I(t, x, u) + \lambda(t) f(t, x, u)$, as

$$-\frac{\partial V(t, x)}{\partial t} = \max_{u} H(t, x, u, \lambda)$$

with $\partial V(t, x) / \partial x = \lambda(t)$ playing the role of the costate variables.
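Given this definition, we further have

$$\frac{d\lambda(t)}{dt} = \frac{\partial^2 V(t, x)}{\partial x \, \partial t} + \frac{\partial^2 V(t, x)}{\partial x^2} \cdot f(t, x, u),$$

and after differentiating both sides of the HJB equation with respect to $x$, with the right-hand side evaluated at the maximizing control,

$$-\frac{\partial^2 V(t, x)}{\partial t \, \partial x} = \frac{\partial I}{\partial x} + \frac{\partial^2 V(t, x)}{\partial x^2} f(t, x, u) + \frac{\partial V(t, x)}{\partial x} \frac{\partial f}{\partial x},$$

which after replacing the appropriate terms recovers the costate equation

$$-\dot{\lambda}(t) = \frac{\partial I}{\partial x} + \lambda(t) \frac{\partial f}{\partial x} = \frac{\partial H}{\partial x},$$

where $\dot{\lambda}(t)$ is Newton notation for the derivative with respect to time.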
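
As a concrete check of the relations above, consider a scalar example that is not part of the original text and is chosen only because it has a closed form: running payoff $I = -(x^2 + u^2)$, dynamics $f = u$, zero scrap value, and terminal time $t_1 = 1$. Under these assumptions the value function is $V(t, x) = -\tanh(t_1 - t)\, x^2$ and the policy function is $h(t, x) = -\tanh(t_1 - t)\, x$. The following Python sketch (all function names are hypothetical) checks numerically that this $V$ satisfies the HJB equation, that a rollout under the policy attains $V(t_0, x_0)$, and that $\lambda = \partial V / \partial x$ obeys the costate equation.

```python
import math

T1 = 1.0  # terminal time t_1 (assumed for illustration)

def p(t):
    """Riccati coefficient: solves p' = p^2 - 1 with p(T1) = 0."""
    return math.tanh(T1 - t)

def value(t, x):
    """Closed-form value function V(t, x) = -p(t) * x^2 for this example."""
    return -p(t) * x * x

def policy(t, x):
    """Feedback policy u* = h(t, x), the maximizer of the Hamiltonian."""
    return -p(t) * x

def hjb_residual(t, x, eps=1e-5):
    """-V_t - max_u {I + V_x * f}, with central finite differences."""
    v_t = (value(t + eps, x) - value(t - eps, x)) / (2 * eps)
    v_x = (value(t, x + eps) - value(t, x - eps)) / (2 * eps)
    u = v_x / 2.0  # first-order condition of -(x^2 + u^2) + v_x * u
    hamiltonian = -(x * x + u * u) + v_x * u
    return -v_t - hamiltonian

def simulate(t0, x0, steps=20000):
    """Euler rollout under the policy; returns the accumulated payoff J."""
    dt = (T1 - t0) / steps
    t, x, payoff = t0, x0, 0.0
    for _ in range(steps):
        u = policy(t, x)
        payoff += -(x * x + u * u) * dt  # running payoff I(t, x, u)
        x += u * dt                      # dynamics dx/dt = f = u
        t += dt
    return payoff

def costate_residual(t, t0=0.0, x0=1.0, eps=1e-5):
    """-d(lambda)/dt - dH/dx along the optimal path; dH/dx = -2x here."""
    def x_star(s):
        # Closed-form optimal state path for this example.
        return x0 * math.cosh(T1 - s) / math.cosh(T1 - t0)
    def lam(s):
        # Costate lambda = V_x evaluated on the optimal path.
        return -2.0 * p(s) * x_star(s)
    lam_dot = (lam(t + eps) - lam(t - eps)) / (2 * eps)
    return -lam_dot - (-2.0 * x_star(t))

if __name__ == "__main__":
    print("HJB residual:    ", hjb_residual(0.3, 1.5))
    print("V(0, 1):         ", value(0.0, 1.0))
    print("J under policy:  ", simulate(0.0, 1.0))
    print("costate residual:", costate_residual(0.5))
```

Both residuals should be zero up to finite-difference error, and the simulated payoff should approach $V(0, 1) = -\tanh(1)$ as the number of Euler steps grows. The closed form is specific to this linear-quadratic example; for general $I$ and $f$ the value function usually has to be approximated numerically.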
