3D pose estimation
3D pose estimation is the process of predicting the transformation of an object from a user-defined reference pose, given an image or a 3D scan. It arises in computer vision and robotics, where the pose or transformation of an object can be used to align computer-aided design models, or to identify, grasp, or manipulate the object. The image data from which the pose of an object is determined can be a single image, a stereo image pair, or an image sequence where, typically, the camera is moving with a known velocity.
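When the reference pose is tied to a known 3D model of the object, a common formulation for the single-image case is the Perspective-n-Point (PnP) problem. The sketch below is a minimal Python example using OpenCV's solvePnP; the model points, image detections, and camera intrinsics are illustrative placeholders, not values from any real setup.

```python
import numpy as np
import cv2

# 3D points of the object in its reference (model) frame, in arbitrary units.
object_points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5],
], dtype=np.float64)

# Corresponding 2D detections in the image (pixels) -- placeholder values.
image_points = np.array([
    [320.0, 240.0],
    [400.0, 238.0],
    [402.0, 310.0],
    [318.0, 312.0],
    [360.0, 275.0],
    [358.0, 220.0],
], dtype=np.float64)

# Pinhole intrinsics (focal length and principal point), no lens distortion.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist_coeffs = np.zeros(5)

# Solve for the rotation (as a Rodrigues vector) and translation that map
# model coordinates into the camera frame.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)   # 3x3 rotation matrix
print("rotation:\n", R, "\ntranslation:\n", tvec.ravel())
```

With real data the 2D points would come from a feature detector matched against the model, and a RANSAC variant such as cv2.solvePnPRansac is usually preferred to cope with outlier correspondences.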
Texture (image processing)
In image processing, texture analysis consists of computing a series of measures intended to characterize the texture perceived in an image. Texture analysis yields information about the spatial arrangement of colours or intensities in all or part of that image. The texture data of an image can be created artificially (artificial textures) or result from the analysis of images captured from real scenes or objects (natural textures).
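To make "a series of measures" concrete, the sketch below is a minimal, library-free Python example: it builds a grey-level co-occurrence matrix for a single horizontal offset and derives a few Haralick-style statistics from it. The checkerboard test image stands in for an artificial texture.

```python
import numpy as np

def glcm_horizontal(image, levels=8):
    """Grey-level co-occurrence matrix for the pixel offset (dx=1, dy=0)."""
    # Quantise intensities (0..255) into a small number of grey levels.
    q = np.clip((image.astype(np.int64) * levels) // 256, 0, levels - 1)
    ref, nbr = q[:, :-1].ravel(), q[:, 1:].ravel()
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (ref, nbr), 1.0)      # count co-occurring (reference, neighbour) pairs
    return m / m.sum()                  # normalise to a joint probability table

def texture_measures(p):
    """A few classic Haralick-style statistics computed from a normalised GLCM."""
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)             # local intensity variation
    energy = np.sum(p ** 2)                          # uniformity of the texture
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))  # closeness to the diagonal
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}

# Example: a synthetic checkerboard (artificial) texture with values 0 and 255.
img = (np.indices((64, 64)).sum(axis=0) % 2) * 255
print(texture_measures(glcm_horizontal(img)))
```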
Disparity map
In 3D computer graphics and computer vision, a depth map is an image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint. The term is related (and may be analogous) to depth buffer, Z-buffer, Z-buffering, and Z-depth. The "Z" in these latter terms refers to the convention that the central axis of view of a camera is in the direction of the camera's Z axis, and not to the absolute Z axis of a scene.
(Example figures: Cubic Structure; Cubic Frame Structure and Floor Depth Map.)
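Since the entry heading refers to the disparity map, it may help to note the standard relation between the two for a rectified stereo pair: depth Z = f·B / d, where f is the focal length in pixels, B the baseline between the two cameras, and d the disparity. The sketch below is a minimal Python illustration with made-up values.

```python
import numpy as np

# Convert a disparity map (pixels) into a depth map (same units as the baseline),
# using the rectified-stereo relation Z = f * B / d.
focal_px = 700.0      # focal length in pixels (illustrative)
baseline_m = 0.12     # distance between the two cameras, in metres (illustrative)

disparity = np.array([[35.0, 70.0],
                      [14.0,  0.0]])   # 0 marks pixels with no stereo match

with np.errstate(divide="ignore"):
    depth = np.where(disparity > 0, focal_px * baseline_m / disparity, np.inf)

print(depth)   # larger disparity -> closer surface; zero disparity -> unknown/infinite
```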
Facial motion capture
Facial motion capture is the process of electronically converting the movements of a person's face into a digital database using cameras or laser scanners. This database may then be used to produce computer graphics (CG), computer animation for movies, games, or real-time avatars. Because the motion of CG characters is derived from the movements of real people, it results in a more realistic and nuanced computer character animation than if the animation were created manually.
Pose (computer vision)
In the fields of computing and computer vision, pose (or spatial pose) represents the position and orientation of an object, usually in three dimensions. Poses are often stored internally as transformation matrices. The term “pose” is largely synonymous with the term “transform”, but a transform may often include scale, whereas pose does not. In computer vision, the pose of an object is often estimated from camera input by the process of pose estimation.
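A minimal Python sketch of the usual internal representation: the rotation and translation are packed into a 4×4 homogeneous transformation matrix (no scale component, consistent with the pose/transform distinction above). The particular rotation and position used here are arbitrary examples.

```python
import numpy as np

def pose_matrix(rotation, translation):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Orientation: 90 degrees about the Z axis; position: (1, 2, 0.5).
theta = np.pi / 2
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
pose = pose_matrix(Rz, [1.0, 2.0, 0.5])

# Applying the pose to a point expressed in the object's local frame.
local_point = np.array([1.0, 0.0, 0.0, 1.0])   # homogeneous coordinates
print(pose @ local_point)                       # -> the point in the world frame
```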
Multimedia information retrieval
Multimedia information retrieval (MMIR or MIR) is a research discipline of computer science that aims at extracting semantic information from multimedia data sources. Data sources include directly perceivable media such as audio and video, indirectly perceivable sources such as text, semantic descriptions, and biosignals, as well as non-perceivable sources such as bioinformation, stock prices, etc. The methodology of MMIR can be organized into three groups, the first of which comprises methods for the summarization of media content (feature extraction).
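As a toy illustration of that first group (summarization of media content via feature extraction), the sketch below reduces an image to a normalised joint colour histogram, a classic compact descriptor. The random image array and the bin count are arbitrary assumptions, not part of any particular MMIR system.

```python
import numpy as np

def colour_histogram(image, bins=4):
    """Summarise an RGB image as a joint colour histogram (a compact feature vector)."""
    # Quantise each channel into `bins` levels, then count joint occurrences.
    q = np.clip((image.astype(np.int64) * bins) // 256, 0, bins - 1)
    index = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    hist = np.bincount(index.ravel(), minlength=bins ** 3).astype(np.float64)
    return hist / hist.sum()   # normalised so images of any size are comparable

# A random stand-in for decoded image data (height x width x RGB, values 0..255).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(120, 160, 3))
feature = colour_histogram(image)
print(feature.shape, feature.sum())   # (64,) 1.0
```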
Contextual image classification
Contextual image classification, a topic of pattern recognition in computer vision, is an approach to classification based on contextual information in images. "Contextual" means that the approach focuses on the relationship between nearby pixels, also called the neighbourhood. The goal of the approach is to classify images by using this contextual information. As in language processing, a single word may have multiple meanings unless the context is provided, and it is the patterns within sentences that carry the informative content we care about.
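A minimal sketch of the idea, not any specific published method: assuming a per-pixel label map has already been produced by a non-contextual classifier, contextual information is injected by letting each pixel's neighbourhood vote on its class.

```python
import numpy as np
from scipy import ndimage

def neighbourhood_vote(labels, num_classes, size=3):
    """Reclassify each pixel by a majority vote over its size x size neighbourhood."""
    votes = np.stack([
        ndimage.uniform_filter((labels == c).astype(np.float64), size=size)
        for c in range(num_classes)
    ])
    return np.argmax(votes, axis=0)

# A clean two-class label image and a noisy version of it, standing in for the
# output of a non-contextual, per-pixel classifier.
clean = np.zeros((40, 40), dtype=np.int64)
clean[10:30, 10:30] = 1
rng = np.random.default_rng(1)
noisy = clean.copy()
flips = rng.random(clean.shape) < 0.1
noisy[flips] = 1 - noisy[flips]

smoothed = neighbourhood_vote(noisy, num_classes=2)
print("errors before:", int((noisy != clean).sum()),
      "errors after:", int((smoothed != clean).sum()))
```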
Eigenface
Eigenfaces are a set of eigenvectors used in the field of computer vision to solve the problem of human face recognition. The use of eigenfaces for recognition was developed by Sirovich and Kirby (1987) and applied by Matthew Turk and Alex Pentland to face classification. The method is considered the first successful example of facial recognition technology.
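A compact sketch of the underlying computation, assuming a small stack of pre-aligned, equally sized grayscale face images (random data stands in for a real dataset here): the eigenfaces are the leading principal components of the mean-centred, flattened images, obtained below via a singular value decomposition.

```python
import numpy as np

# Stand-in for a dataset of aligned grayscale face images (n_faces x height x width).
rng = np.random.default_rng(0)
faces = rng.random((50, 32, 32))

# Flatten each face into a row vector and remove the mean face.
X = faces.reshape(len(faces), -1)
mean_face = X.mean(axis=0)
X_centred = X - mean_face

# The eigenfaces are the right singular vectors of the centred data matrix,
# i.e. the eigenvectors of its covariance matrix, ordered by explained variance.
_, singular_values, Vt = np.linalg.svd(X_centred, full_matrices=False)
eigenfaces = Vt.reshape(-1, 32, 32)

# Project a face onto the first k eigenfaces to obtain a compact descriptor,
# which can then be compared against stored descriptors for recognition.
k = 10
weights = (faces[0].ravel() - mean_face) @ Vt[:k].T
print(eigenfaces.shape, weights.shape)   # (50, 32, 32) (10,)
```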
Kernel (image processing)
In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by performing a convolution between the kernel and the image. Depending on its values, a kernel can produce a wide range of effects. These are just a few examples of the effects that convolution kernels have on images. The origin is the position of the kernel that lies (conceptually) above the current pixel.
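A minimal sketch of such a convolution in Python, using SciPy's convolve2d with a classic sharpening kernel and a discrete Laplacian for edge detection; the random image is a placeholder for a real grayscale image.

```python
import numpy as np
from scipy.signal import convolve2d

# A classic 3x3 sharpening kernel: boosts the centre pixel and subtracts neighbours.
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=np.float64)

# A 3x3 edge-detection (discrete Laplacian) kernel.
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=np.float64)

# Placeholder grayscale image; in practice this would be loaded from a file.
rng = np.random.default_rng(0)
image = rng.random((64, 64))

# `same` keeps the output the size of the input; `symm` mirrors pixels at the border.
sharpened = convolve2d(image, sharpen, mode="same", boundary="symm")
edges = convolve2d(image, laplacian, mode="same", boundary="symm")
print(sharpened.shape, edges.shape)
```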
Homography (computer vision)
In the field of computer vision, any two images of the same planar surface in space are related by a homography (assuming a pinhole camera model). This has many practical applications, such as image rectification, image registration, or the computation of camera motion (rotation and translation) between two images. Once camera resectioning has been done from an estimated homography matrix, this information may be used for navigation, or to insert models of 3D objects into an image or video, so that they are rendered with the correct perspective and appear to have been part of the original scene (see Augmented reality).
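A minimal sketch of homography estimation from point correspondences, assuming OpenCV; the coordinates below are made-up placeholders, and with real, noisy feature matches one would normally pass cv2.RANSAC as the estimation method.

```python
import numpy as np
import cv2

# Corresponding points on the same planar surface, seen in two images (placeholders).
pts_src = np.array([[100, 100], [400,  90], [420, 380], [110, 400]], dtype=np.float64)
pts_dst = np.array([[130, 120], [410, 140], [390, 420], [ 90, 390]], dtype=np.float64)

# Estimate the 3x3 homography H such that pts_dst ~ H * pts_src (up to scale).
H, mask = cv2.findHomography(pts_src, pts_dst)

# Map a new point from the first image into the second using the homography.
p = np.array([250.0, 250.0, 1.0])   # homogeneous coordinates
q = H @ p
print(H)
print(q[:2] / q[2])                  # back to pixel coordinates
```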