3D pose estimation
3D pose estimation is the process of predicting the transformation of an object from a user-defined reference pose, given an image or a 3D scan. It arises in computer vision and robotics, where the pose or transformation of an object can be used for alignment of computer-aided design (CAD) models, or for identification, grasping, or manipulation of the object. The image data from which the pose of an object is determined can be a single image, a stereo image pair, or an image sequence in which, typically, the camera moves with a known velocity.
Image texture
An image texture is a set of metrics calculated in image processing and designed to quantify the perceived texture of an image. Image texture gives information about the spatial arrangement of colors or intensities in an image or in a selected region of an image. Image textures can be artificially created or found in natural scenes captured in an image. They are one means of supporting the segmentation or classification of images. For more accurate segmentation, the most useful features are spatial frequency and average grey level.
Depth map
In 3D computer graphics and computer vision, a depth map is an image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint. The term is related (and may be analogous) to depth buffer, Z-buffer, Z-buffering, and Z-depth. The "Z" in these latter terms relates to the convention that the central axis of view of a camera lies in the direction of the camera's Z axis, not to the absolute Z axis of a scene.
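To make the camera-Z convention concrete, the following minimal sketch back-projects a depth map into 3D points in the camera frame. It assumes a pinhole camera model; the function name `backproject` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are illustrative assumptions, not part of the definition above. Treating non-positive depths as missing measurements is also an assumption.

```python
def backproject(depth, fx, fy, cx, cy):
    """Convert a depth map (rows of Z values along the camera's Z axis)
    into a list of (X, Y, Z) points in the camera frame.

    Assumes a pinhole camera; fx, fy, cx, cy are hypothetical intrinsics.
    """
    points = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0:
                continue                   # assumption: non-positive depth means "no data"
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth map with one missing value.
depth = [[2.0, 0.0],
         [4.0, 2.0]]
points = backproject(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Each returned Z value is the stored depth itself, which is why the depth map encodes distance along the camera's viewing axis and not along a scene-wide Z axis.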
Facial motion capture
Facial motion capture is the process of electronically converting the movements of a person's face into a digital database using cameras or laser scanners. This database may then be used to produce computer graphics (CG) and computer animation for movies, games, or real-time avatars. Because the motion of CG characters is derived from the movements of real people, it results in more realistic and nuanced computer character animation than if the animation were created manually.
Pose (computer vision)
In the fields of computing and computer vision, pose (or spatial pose) represents the position and orientation of an object, usually in three dimensions. Poses are often stored internally as transformation matrices. The term “pose” is largely synonymous with the term “transform”, but a transform may often include scale, whereas pose does not. In computer vision, the pose of an object is often estimated from camera input by the process of pose estimation.
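As an illustration of storing a pose as a transformation matrix, the sketch below builds a 4x4 homogeneous matrix from a rotation and a translation and applies it to a point. Restricting the rotation to a single angle about the Z axis is a simplifying assumption, and the function names `pose_matrix` and `transform_point` are hypothetical.

```python
import math


def pose_matrix(yaw, tx, ty, tz):
    """Build a 4x4 homogeneous pose: rotation by `yaw` radians about Z,
    followed by a translation (tx, ty, tz). No scale term is included,
    in line with the distinction between a pose and a general transform."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c,   -s,  0.0, tx],
        [s,    c,  0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(pose, point):
    """Map a 3D point from the object frame into the reference frame."""
    x, y, z = point
    homogeneous = (x, y, z, 1.0)
    return tuple(
        sum(pose[row][k] * homogeneous[k] for k in range(4))
        for row in range(3)
    )


# Rotate 90 degrees about Z, then translate by (1, 2, 3).
pose = pose_matrix(math.pi / 2, 1.0, 2.0, 3.0)
moved = transform_point(pose, (1.0, 0.0, 0.0))  # approximately (1, 3, 3)
```

The upper-left 3x3 block holds the orientation and the last column holds the position, so a single matrix product applies both.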
Multimedia information retrieval
Multimedia information retrieval (MMIR or MIR) is a research discipline of computer science that aims at extracting semantic information from multimedia data sources. Data sources include directly perceivable media such as audio, image, and video; indirectly perceivable sources such as text, semantic descriptions, and biosignals; and non-perceivable sources such as bioinformation and stock prices. The methodology of MMIR can be organized in three groups, the first of which comprises methods for the summarization of media content (feature extraction).
Contextual image classification
Contextual image classification, a topic of pattern recognition in computer vision, is an approach to classification based on contextual information in images. "Contextual" means that the approach focuses on the relationship between nearby pixels, also called the neighbourhood. The goal is to classify images by using this contextual information. The situation is similar to language processing: a single word may have multiple meanings unless the context is provided, and the patterns within the sentences are the only informative segments we care about.
Eigenface
An eigenface (ˈaɪgənˌfeɪs) is the name given to a set of eigenvectors when used in the computer vision problem of human face recognition. The approach of using eigenfaces for recognition was developed by Sirovich and Kirby and used by Matthew Turk and Alex Pentland in face classification. The eigenvectors are derived from the covariance matrix of the probability distribution over the high-dimensional vector space of face images. The eigenfaces themselves form a basis set of all images used to construct the covariance matrix.
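The derivation of eigenvectors from a covariance matrix can be sketched on toy data. The example below treats each "face" as a flattened list of four pixel values, forms the sample covariance matrix of the mean-centred images, and extracts its leading eigenvector by power iteration. The toy data, the function names, and the use of power iteration are all assumptions made for illustration; practical systems typically work with far larger images and use a library eigendecomposition or SVD.

```python
def mean_face(faces):
    """Pixel-wise average of all face vectors."""
    n = len(faces)
    return [sum(face[k] for face in faces) / n for k in range(len(faces[0]))]


def covariance(faces):
    """Sample covariance matrix (d x d) of the mean-centred face vectors."""
    mu = mean_face(faces)
    centred = [[face[k] - mu[k] for k in range(len(mu))] for face in faces]
    d, n = len(mu), len(faces)
    return [
        [sum(c[r] * c[s] for c in centred) / n for s in range(d)]
        for r in range(d)
    ]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply and renormalise a start vector."""
    d = len(matrix)
    v = [1.0 / (k + 1) for k in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(matrix[r][s] * v[s] for s in range(d)) for r in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Four hypothetical 2x2 "face" images, flattened to length-4 vectors.
faces = [
    [10.0, 20.0, 20.0, 10.0],
    [30.0, 20.0, 20.0, 30.0],
    [20.0, 25.0, 25.0, 20.0],
    [20.0, 15.0, 15.0, 20.0],
]
cov = covariance(faces)
eigenface = leading_eigenvector(cov)  # unit vector; reshape to 2x2 to view as an image
```

In this toy set the largest variation is in the first and last pixels, so the leading eigenvector concentrates its weight there.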
Kernel (image processing)
In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between the kernel and an image. Or more simply, when each pixel in the output image is a function of the nearby pixels (including itself) in the input image, the kernel is that function. The general expression of a convolution is $g(x, y) = \omega * f(x, y) = \sum_{i=-a}^{a} \sum_{j=-b}^{b} \omega(i, j)\, f(x - i, y - j)$, where $g(x, y)$ is the filtered image, $f(x, y)$ is the original image, and $\omega$ is the filter kernel. Every element of the filter kernel is considered by $-a \le i \le a$ and $-b \le j \le b$.
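A direct, unoptimised sketch of this operation follows. It implements the standard discrete convolution sum, in which output pixel (x, y) accumulates kernel(i, j) times input pixel (x - i, y - j) over all kernel offsets. Treating pixels outside the image as zero is an assumption (other border policies exist), and the function name `convolve2d` is hypothetical.

```python
def convolve2d(image, kernel):
    """Convolve a 2D image (list of rows) with an odd-sized kernel.

    Kernel offsets run over -a..a and -b..b; out-of-range input pixels
    are treated as zero (an assumed border policy).
    """
    rows, cols = len(image), len(image[0])
    a, b = len(kernel) // 2, len(kernel[0]) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            total = 0.0
            for i in range(-a, a + 1):
                for j in range(-b, b + 1):
                    xi, yj = x - i, y - j
                    if 0 <= xi < rows and 0 <= yj < cols:
                        # kernel[i + a][j + b] stores the weight for offset (i, j)
                        total += kernel[i + a][j + b] * image[xi][yj]
            out[x][y] = total
    return out


# 3x3 box blur applied to an image with a single bright pixel.
box_blur = [[1.0 / 9.0] * 3 for _ in range(3)]
impulse = [[0.0, 0.0, 0.0],
           [0.0, 9.0, 0.0],
           [0.0, 0.0, 0.0]]
blurred = convolve2d(impulse, box_blur)

# A kernel whose only weight sits at offset (0, 1) shifts the image one column right.
shift_right = [[0.0, 0.0, 0.0],
               [0.0, 0.0, 1.0],
               [0.0, 0.0, 0.0]]
shifted = convolve2d([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], shift_right)
```

The shift example shows the index reversal that distinguishes convolution from correlation: a weight at offset (0, 1) reads the input pixel at column y - 1.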
Homography (computer vision)
In the field of computer vision, any two images of the same planar surface in space are related by a homography (assuming a pinhole camera model). This has many practical applications, such as image rectification, image registration, or estimation of camera motion (rotation and translation) between two images. Once camera resectioning has been done from an estimated homography matrix, this information may be used for navigation, or to insert models of 3D objects into an image or video, so that they are rendered with the correct perspective and appear to have been part of the original scene (see Augmented reality).
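How a homography relates points in two images can be sketched as follows: a 3x3 matrix is applied to a point in homogeneous coordinates, and the result is divided by its third component. The matrices below are made-up examples, not estimates from real images, and the function name `apply_homography` is hypothetical. Estimating a homography from point correspondences is a separate step not shown here.

```python
def apply_homography(H, point):
    """Map a 2D point (x, y) through a 3x3 homography H.

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied
    by H, and projected back by dividing by the third component.
    """
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


# A pure translation is a special case of a homography.
translate = [[1.0, 0.0, 2.0],
             [0.0, 1.0, 3.0],
             [0.0, 0.0, 1.0]]
p1 = apply_homography(translate, (1.0, 1.0))

# A non-zero entry in the last row introduces perspective foreshortening.
perspective = [[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.5, 0.0, 1.0]]
p2 = apply_homography(perspective, (2.0, 2.0))
```

The division by the third component is what lets a single linear map express perspective effects that an affine transform cannot.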