Clustering high-dimensional dataClustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology can produce many measurements at once, and the clustering of text documents, where, if a word-frequency vector is used, the number of dimensions equals the size of the vocabulary.
Sparse approximationSparse approximation (also known as sparse representation) theory deals with sparse solutions for systems of linear equations. Techniques for finding these solutions and exploiting them in applications have found wide use in , signal processing, machine learning, medical imaging, and more. Consider a linear system of equations , where is an underdetermined matrix and . The matrix (typically assumed to be full-rank) is referred to as the dictionary, and is a signal of interest.
Multilinear subspace learningMultilinear subspace learning is an approach for disentangling the causal factor of data formation and performing dimensionality reduction. The Dimensionality reduction can be performed on a data tensor that contains a collection of observations have been vectorized, or observations that are treated as matrices and concatenated into a data tensor. Here are some examples of data tensors whose observations are vectorized or whose observations are matrices concatenated into data tensor s (2D/3D), video sequences (3D/4D), and hyperspectral cubes (3D/4D).
Scale space implementationIn the areas of computer vision, and signal processing, the notion of scale-space representation is used for processing measurement data at multiple scales, and specifically enhance or suppress image features over different ranges of scale (see the article on scale space). A special type of scale-space representation is provided by the Gaussian scale space, where the image data in N dimensions is subjected to smoothing by Gaussian convolution.
Scale spaceScale-space theory is a framework for multi-scale signal representation developed by the computer vision, and signal processing communities with complementary motivations from physics and biological vision. It is a formal theory for handling image structures at different scales, by representing an image as a one-parameter family of smoothed images, the scale-space representation, parametrized by the size of the smoothing kernel used for suppressing fine-scale structures.
Test automationIn software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive but necessary tasks in a formalized testing process already in place, or perform additional testing that would be difficult to do manually. Test automation is critical for continuous delivery and continuous testing.
Software testingSoftware testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include, but are not necessarily limited to: analyzing the product requirements for completeness and correctness in various contexts like industry perspective, business perspective, feasibility and viability of implementation, usability, performance, security, infrastructure considerations, etc.
Matching pursuitMatching pursuit (MP) is a sparse approximation algorithm which finds the "best matching" projections of multidimensional data onto the span of an over-complete (i.e., redundant) dictionary . The basic idea is to approximately represent a signal from Hilbert space as a weighted sum of finitely many functions (called atoms) taken from . An approximation with atoms has the form where is the th column of the matrix and is the scalar weighting factor (amplitude) for the atom . Normally, not every atom in will be used in this sum.
Binary classificationBinary classification is the task of classifying the elements of a set into two groups (each called class) on the basis of a classification rule. Typical binary classification problems include: Medical testing to determine if a patient has certain disease or not; Quality control in industry, deciding whether a specification has been met; In information retrieval, deciding whether a page should be in the result set of a search or not. Binary classification is dichotomization applied to a practical situation.
Performance engineeringPerformance engineering encompasses the techniques applied during a systems development life cycle to ensure the non-functional requirements for performance (such as throughput, latency, or memory usage) will be met. It may be alternatively referred to as systems performance engineering within systems engineering, and software performance engineering or application performance engineering within software engineering.