In mathematics, the subderivative, subgradient, and subdifferential generalize the derivative to convex functions which are not necessarily differentiable. Subderivatives arise in convex analysis, the study of convex functions, often in connection to convex optimization.
Let f be a real-valued convex function defined on an open interval of the real line. Such a function need not be differentiable at every point: for example, the absolute value function f(x) = |x| is not differentiable at x = 0. However, for any point x0 in the domain of the function one can draw a line which goes through the point (x0, f(x0)) and which everywhere either touches or lies below the graph of f. The slope of such a line is called a subderivative.
Rigorously, a subderivative of a convex function f at a point x0 in the open interval I is a real number c such that

f(x) - f(x0) ≥ c (x - x0)

for all x in I. By the converse of the mean value theorem, the set of subderivatives at x0 for a convex function is a nonempty closed interval [a, b], where a and b are the one-sided limits

a = lim_{x → x0-} (f(x) - f(x0)) / (x - x0),   b = lim_{x → x0+} (f(x) - f(x0)) / (x - x0),

both of which exist for a convex function and satisfy a ≤ b.
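As a concrete illustration of these one-sided limits, the following sketch estimates the endpoints a and b numerically from one-sided difference quotients; the function name one_sided_slopes and the step size h are illustrative choices, not part of the definition above.

    # Numerically estimate the subderivative interval [a, b] of a convex
    # function f at x0 via one-sided difference quotients (illustrative only).
    def one_sided_slopes(f, x0, h=1e-8):
        a = (f(x0) - f(x0 - h)) / h   # approximates the left-hand limit
        b = (f(x0 + h) - f(x0)) / h   # approximates the right-hand limit
        return a, b

    print(one_sided_slopes(abs, 0.0))  # roughly (-1.0, 1.0): the kink of |x|
    print(one_sided_slopes(abs, 2.0))  # roughly (1.0, 1.0): a differentiable point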
The set [a, b] of all subderivatives is called the subdifferential of the function f at x0, denoted by ∂f(x0). If f is convex, then its subdifferential at any point is non-empty. Moreover, if the subdifferential at x0 contains exactly one subderivative, then ∂f(x0) = {f'(x0)} and f is differentiable at x0.
Consider the function f(x) = |x|, which is convex. Then the subdifferential at the origin is the interval [-1, 1]. The subdifferential at any point x0 < 0 is the singleton set {-1}, while the subdifferential at any point x0 > 0 is the singleton set {1}. This is similar to the sign function, but the subdifferential is not single-valued at 0, instead including all possible subderivatives there.
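The closed-form subdifferential of this example can be written down directly; the sketch below returns it as the endpoints of the interval [a, b] (a small illustrative helper, not a standard library function).

    # Closed-form subdifferential of f(x) = |x|, returned as the endpoints
    # of the closed interval [a, b]; a == b where f is differentiable.
    def subdifferential_abs(x0):
        if x0 < 0:
            return (-1.0, -1.0)   # the singleton {-1}
        if x0 > 0:
            return (1.0, 1.0)     # the singleton {+1}
        return (-1.0, 1.0)        # the whole interval [-1, 1] at the kink

    print(subdifferential_abs(-3.0))  # (-1.0, -1.0)
    print(subdifferential_abs(0.0))   # (-1.0, 1.0)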
A convex function is differentiable at x0 if and only if the subdifferential there is a singleton set, namely {f'(x0)}.
A point x0 is a global minimum of a convex function f if and only if zero is contained in the subdifferential, that is, 0 ∈ ∂f(x0). For instance, for f(x) = |x| one may draw a horizontal "subtangent line" to the graph of f at the origin, reflecting the fact that 0 ∈ ∂f(0) = [-1, 1] and that 0 is a global minimum.
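This optimality condition is what makes the subgradient method work on non-differentiable convex functions. Below is a minimal sketch of subgradient descent on f(x) = |x|; the starting point, the diminishing step size 1/k, and the choice of subgradient 0 at the kink are illustrative assumptions.

    # Subgradient method on f(x) = |x| (sketch). Any value in [-1, 1] is a
    # valid subgradient at x = 0; choosing 0 there lets the iteration halt
    # exactly at the minimizer once it is reached.
    def subgradient_abs(x):
        return -1.0 if x < 0 else (1.0 if x > 0 else 0.0)

    x = 5.0
    for k in range(1, 101):
        x -= (1.0 / k) * subgradient_abs(x)  # diminishing step size 1/k
    print(x)  # ends up close to 0, the global minimum of |x|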
This course teaches an overview of modern optimization methods, for applications in machine learning and data science. In particular, scalability of algorithms to large datasets will be discussed ...
This course provides an overview of key advances in continuous optimization and statistical analysis for machine learning. We review recent learning formulations and models as well as their guarantees ...
Machine learning methods are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced, analyzed and ...
Explores Gradient Descent for Linear MSE in machine learning, covering computation, complexity, variants, Stochastic Gradient Descent, penalty functions, implementation issues, and non-convex optimization.
Explores optimization methods, including convexity, gradient descent, and non-convex minimization, with examples like maximum likelihood estimation and ridge regression.
Explores the role of computation in optimization, focusing on gradient descent for convex and nonconvex problems.
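As a complement to the lecture summaries above, here is a minimal sketch of full-batch gradient descent on a linear MSE objective; the synthetic data, step size, and iteration count are illustrative assumptions and not taken from the course material.

    import numpy as np

    # Full-batch gradient descent on the linear MSE objective
    #   L(w) = (1 / (2n)) * ||y - X w||^2,  with grad L(w) = -(1/n) X^T (y - X w)
    rng = np.random.default_rng(0)
    n, d = 200, 3
    X = rng.normal(size=(n, d))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=n)

    w = np.zeros(d)
    step = 0.1
    for _ in range(500):
        grad = -X.T @ (y - X @ w) / n
        w -= step * grad
    print(w)  # approaches w_true up to the noise level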
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.
In mathematics, a real-valued function is called convex if the line segment between any two distinct points on the graph of the function lies above the graph between the two points. Equivalently, a function is convex if its epigraph (the set of points on or above the graph of the function) is a convex set. A twice-differentiable function of a single variable is convex if and only if its second derivative is nonnegative on its entire domain.
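In symbols, the chord condition described above is the usual convexity inequality (a routine restatement, with the points x1, x2 and the parameter t introduced here purely for notation):

    f(t x1 + (1 - t) x2) ≤ t f(x1) + (1 - t) f(x2)   for all x1, x2 in the domain and all t in [0, 1].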
The integration of discrete choice models with Mixed Integer Linear Programming (MILP) models provides a better understanding of customers' preferences to operators while planning for their systems. However, the formulations associated with the former are ...
2018
The integration of discrete choice models in Mixed Integer Linear Programming (MILP) models provides a better understanding of the preferences of the customers to the operators while planning for their systems. However, the formulations associated with ...
We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as optimization on the space of probability measures, with Kullback-Leibler (KL) divergence as the objective functional. We show that an under-damped form of the Langevin algorithm ...
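For context on the kind of algorithm that abstract refers to, here is a minimal Euler-type discretization of under-damped (kinetic) Langevin dynamics targeting a standard Gaussian; the target, friction parameter, step size, and discretization are illustrative assumptions and not the scheme analyzed in the publication.

    import numpy as np

    # Under-damped (kinetic) Langevin dynamics, Euler-type discretization:
    #   dx = v dt,   dv = (grad_log_pi(x) - gamma * v) dt + sqrt(2 * gamma) dW
    # Target here: standard Gaussian, so grad_log_pi(x) = -x.
    rng = np.random.default_rng(0)
    gamma, h, n_steps = 2.0, 0.05, 20000
    x, v = 0.0, 0.0
    samples = []
    for _ in range(n_steps):
        v += h * (-x - gamma * v) + np.sqrt(2.0 * gamma * h) * rng.normal()
        x += h * v
        samples.append(x)
    print(np.mean(samples), np.var(samples))  # roughly 0 and 1 for the Gaussian target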