In mathematics, low-rank approximation is a minimization problem, in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to a constraint that the approximating matrix has reduced rank. The problem is used for mathematical modeling and data compression. The rank constraint is related to a constraint on the complexity of a model that fits the data. In applications, often there are other constraints on the approximating matrix apart from the rank constraint, e.g., non-negativity and Hankel structure.
Low-rank approximation is closely related to numerous other techniques, including principal component analysis, factor analysis, total least squares, latent semantic analysis, orthogonal regression, and dynamic mode decomposition.
Given
structure specification $\mathcal{S} \colon \mathbb{R}^{n_p} \to \mathbb{R}^{m \times n}$,
vector of structure parameters $p \in \mathbb{R}^{n_p}$,
norm $\| \cdot \|$, and
desired rank $r$,
the structured low-rank approximation problem is
$$ \operatorname*{minimize}_{\hat p} \quad \| p - \hat p \| \quad \text{subject to} \quad \operatorname{rank}\big(\mathcal{S}(\hat p)\big) \le r. $$
Applications include:
Linear system identification, in which case the approximating matrix is Hankel structured (a minimal computational sketch of this case follows the list).
Machine learning, in which case the approximating matrix is nonlinearly structured.
Recommender systems, in which case the data matrix has missing values and the approximation is categorical.
Distance matrix completion, in which case there is a positive definiteness constraint.
Natural language processing, in which case the approximation is nonnegative.
Computer algebra, in which case the approximation is Sylvester structured.
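As an illustration of the Hankel-structured case mentioned above (linear system identification), the following is a minimal sketch of one common heuristic, alternating projections (Cadzow iterations), which alternates between the set of matrices of rank at most $r$ and the set of Hankel matrices. The function names, matrix sizes, and iteration count are assumptions made for this example, and the iteration is a heuristic rather than an exact solver for the structured minimization problem.

```python
# Minimal sketch: Hankel-structured low-rank approximation by alternating
# projections (Cadzow iterations). Names, sizes, and the iteration count are
# illustrative assumptions, not part of the article.
import numpy as np

def hankel_from_params(p, m):
    """Build the m x (len(p) - m + 1) Hankel matrix with H[i, j] = p[i + j]."""
    n = len(p) - m + 1
    return np.array([[p[i + j] for j in range(n)] for i in range(m)])

def params_from_hankel(H):
    """Project onto Hankel structure by averaging the anti-diagonals of H."""
    m, n = H.shape
    sums = np.zeros(m + n - 1)
    counts = np.zeros(m + n - 1)
    for i in range(m):
        for j in range(n):
            sums[i + j] += H[i, j]
            counts[i + j] += 1
    return sums / counts

def truncate_rank(H, r):
    """Project onto matrices of rank at most r via the truncated SVD."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def cadzow(p, m, r, iters=200):
    """Alternate between the rank-r set and the Hankel structure."""
    for _ in range(iters):
        p = params_from_hankel(truncate_rank(hankel_from_params(p, m), r))
    return p

# Noisy samples of a single decaying exponential admit a rank-1 Hankel matrix.
rng = np.random.default_rng(0)
t = np.arange(20)
p_noisy = 0.9 ** t + 0.01 * rng.standard_normal(t.size)
p_hat = cadzow(p_noisy, m=10, r=1)
print(np.linalg.matrix_rank(hankel_from_params(p_hat, 10), tol=1e-8))  # typically 1
```

The same alternating-projection pattern adapts to other structured cases by swapping the structure projection, at the cost of only heuristic (not guaranteed optimal) solutions.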
The unstructured problem with fit measured by the Frobenius norm, i.e.,
$$ \operatorname*{minimize}_{\hat D} \quad \| D - \hat D \|_F \quad \text{subject to} \quad \operatorname{rank}(\hat D) \le r, $$
has an analytic solution in terms of the singular value decomposition of the data matrix. The result is referred to as the matrix approximation lemma or the Eckart–Young–Mirsky theorem. The problem was originally solved by Erhard Schmidt in the infinite-dimensional context of integral operators (although his methods generalize readily to arbitrary compact operators on Hilbert spaces) and was later rediscovered by C. Eckart and G. Young. L. Mirsky generalized the result to arbitrary unitarily invariant norms.
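As a numerical illustration of the theorem, the following minimal sketch (using NumPy; the matrix size and target rank are assumptions of the example) forms the truncated singular value decomposition of a data matrix and checks that the Frobenius-norm error of the resulting rank-$r$ approximation equals the root-sum-of-squares of the discarded singular values.

```python
# Minimal numerical sketch of the Eckart-Young-Mirsky theorem: keeping the r
# largest singular values of D gives a best rank-r approximation in the
# Frobenius norm. Matrix size and rank below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((8, 6))
r = 2

U, s, Vt = np.linalg.svd(D, full_matrices=False)
D_hat = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]  # truncated SVD reconstruction

print(np.linalg.matrix_rank(D_hat))                      # 2
print(np.isclose(np.linalg.norm(D - D_hat, 'fro'),
                 np.sqrt(np.sum(s[r:] ** 2))))           # True: optimal error
```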