Varimax rotationIn statistics, a varimax rotation is used to simplify the expression of a particular sub-space in terms of just a few major items each. The actual coordinate system is unchanged, it is the orthogonal basis that is being rotated to align with those coordinates. The sub-space found with principal component analysis or factor analysis is expressed as a dense basis with many non-zero weights which makes it hard to interpret. Varimax is so called because it maximizes the sum of the variances of the squared loadings (squared correlations between variables and factors).
Tensor rank decompositionIn multilinear algebra, the tensor rank decomposition or the decomposition of a tensor is the decomposition of a tensor in terms of a sum of minimum tensors. This is an open problem. Canonical polyadic decomposition (CPD) is a variant of the rank decomposition which computes the best fitting terms for a user specified . The CP decomposition has found some applications in linguistics and chemometrics. The CP rank was introduced by Frank Lauren Hitchcock in 1927 and later rediscovered several times, notably in psychometrics.
Multiple correspondence analysisIn statistics, multiple correspondence analysis (MCA) is a data analysis technique for nominal categorical data, used to detect and represent underlying structures in a data set. It does this by representing data as points in a low-dimensional Euclidean space. The procedure thus appears to be the counterpart of principal component analysis for categorical data. MCA can be viewed as an extension of simple correspondence analysis (CA) in that it is applicable to a large set of categorical variables.
Lanczos algorithmThe Lanczos algorithm is an iterative method devised by Cornelius Lanczos that is an adaptation of power methods to find the "most useful" (tending towards extreme highest/lowest) eigenvalues and eigenvectors of an Hermitian matrix, where is often but not necessarily much smaller than . Although computationally efficient in principle, the method as initially formulated was not useful, due to its numerical instability. In 1970, Ojalvo and Newman showed how to make the method numerically stable and applied it to the solution of very large engineering structures subjected to dynamic loading.
Rayleigh quotientIn mathematics, the Rayleigh quotient (ˈreɪ.li) for a given complex Hermitian matrix and nonzero vector is defined as:For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugate transpose to the usual transpose . Note that for any non-zero scalar . Recall that a Hermitian (or real symmetric) matrix is diagonalizable with only real eigenvalues. It can be shown that, for a given matrix, the Rayleigh quotient reaches its minimum value (the smallest eigenvalue of ) when is (the corresponding eigenvector).
Weka (software)Waikato Environment for Knowledge Analysis (Weka) is a collection of machine learning and data analysis free software licensed under the GNU General Public License. It was developed at the University of Waikato, New Zealand and is the companion software to the book "Data Mining: Practical Machine Learning Tools and Techniques". Weka contains a collection of visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to these functions.
Total least squaresIn applied statistics, total least squares is a type of errors-in-variables regression, a least squares data modeling technique in which observational errors on both dependent and independent variables are taken into account. It is a generalization of Deming regression and also of orthogonal regression, and can be applied to both linear and non-linear models. The total least squares approximation of the data is generically equivalent to the best, in the Frobenius norm, low-rank approximation of the data matrix.
Ordination (statistics)Ordination or gradient analysis, in multivariate analysis, is a method complementary to data clustering, and used mainly in exploratory data analysis (rather than in hypothesis testing). In contrast to cluster analysis, ordination orders quantities in a (usually lower-dimensional) latent space. In the ordination space, quantities that are near each other share attributes (i.e., are similar to some degree), and dissimilar objects are farther from each other.