Video tracking is the process of locating a moving object (or multiple objects) over time using a camera. It has a variety of uses, some of which are: human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging and video editing. Video tracking can be a time-consuming process due to the amount of data that is contained in video. Adding further to the complexity is the possible need to use object recognition techniques for tracking, a challenging problem in its own right.
The objective of video tracking is to associate target objects in consecutive video frames. The association can be especially difficult when the objects are moving fast relative to the frame rate. Another situation that increases the complexity of the problem is when the tracked object changes orientation over time. For these situations video tracking systems usually employ a motion model which describes how the image of the target might change for different possible motions of the object.
Examples of simple motion models are:
When tracking planar objects, the motion model is a 2D transformation (affine transformation or homography) of an image of the object (e.g. the initial frame).
When the target is a rigid 3D object, the motion model defines its aspect depending on its 3D position and orientation.
For video compression, key frames are divided into macroblocks. The motion model is a disruption of a key frame, where each macroblock is translated by a motion vector given by the motion parameters.
The image of deformable objects can be covered with a mesh, the motion of the object is defined by the position of the nodes of the mesh.
To perform video tracking an algorithm analyzes sequential video frames and outputs the movement of targets between the frames. There are a variety of algorithms, each having strengths and weaknesses. Considering the intended use is important when choosing which algorithm to use.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
This summer school is an hands-on introduction on the fundamentals of image analysis for scientists. A series of lectures provide students with the key concepts in the field, and are followed by pract
The course provides a comprehensive overview of methods, algorithms, and computer tools used in bioimage analysis. It exposes fundamental concepts and practical computer solutions to extract quantitat
Le cours suivi propose une initiation aux concepts de base de la programmation impérative tels que : variables, expressions, structures de contrôle, fonctions/méthodes, en les illustrant dans la synta
Le cours suivi propose une introduction aux concepts de base de la programmation orientée objet tels que : encapsulation et abstraction, classes/objets, attributs/méthodes, héritage, polymorphisme, ..
Ce cours initie à la programmation en utilisant le langage C++. Il ne présuppose pas de connaissance préalable. Les aspects plus avancés (programmation orientée objet) sont donnés dans un cours suivan
OpenCV (Open Source Computer Vision Library) is a library of programming functions mainly for real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage, then Itseez (which was later acquired by Intel). The library is cross-platform and licensed as free and open-source software under Apache License 2. Starting in 2011, OpenCV features GPU acceleration for real-time operations.
Motion capture (sometimes referred as mo-cap or mocap, for short) is the process of recording the movement of objects or people. It is used in military, entertainment, sports, medical applications, and for validation of computer vision and robots. In filmmaking and video game development, it refers to recording actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture.
Computer vision tasks include methods for , , and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the forms of decisions. Understanding in this context means the transformation of visual images (the input to the retina in the human analog) into descriptions of the world that make sense to thought processes and can elicit appropriate action.
In this letter, we introduce an optimal transport framework for inferring power distributions over both spatial location and temporal frequency. Recently, it has been shown that optimal transport is a powerful tool for estimating spatial spectra that chang ...
Deep learning has revolutionized the field of computer vision, a success largely attributable to the growing size of models, datasets, and computational power.Simultaneously, a critical pain point arises as several computer vision applications are deployed ...
EPFL2024
Recent advancements in deep learning have revolutionized 3D computer vision, enabling the extraction of intricate 3D information from 2D images and video sequences. This thesis explores the application of deep learning in three crucial challenges of 3D com ...