Summary
Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually between adjacent frames in a video sequence. It is an ill-posed problem because the motion takes place in three dimensions, whereas the images are projections of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image (global motion estimation) or to specific parts, such as rectangular blocks, arbitrarily shaped patches, or even individual pixels. They may be represented by a translational model or by other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.
The terms motion estimation and optical flow are often used interchangeably. Motion estimation is also related in concept to stereo correspondence. All of these terms refer to the process of finding corresponding points between two images or video frames. Points that correspond to each other in two views (images or frames) of a real scene or object are usually the same point in that scene or on that object.
Before motion estimation can be performed, a measure of correspondence must be defined: the matching metric, which measures how similar two image points are. There is no single correct choice; the matching metric is usually chosen according to what the final estimated motion will be used for and to the optimisation strategy used in the estimation process.
Each motion vector represents a macroblock in a picture based on the position of this macroblock (or a similar one) in another picture, called the reference picture. The H.264/MPEG-4 AVC standard defines a motion vector as "a two-dimensional vector used for inter prediction that provides an offset from the coordinates in the decoded picture to the coordinates in a reference picture".
The methods for finding motion vectors can be categorised into pixel-based methods ("direct") and feature-based methods ("indirect").
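As an illustration of a direct, pixel-based method, the following is a minimal sketch of exhaustive block matching. It assumes the sum of absolute differences (SAD) as the matching metric, which is only one possible choice, and a purely translational model. The function names, the block size `n`, and the search range `search` are hypothetical and chosen for this example; images are plain lists of rows of grey levels.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_block_motion(cur, ref, bx, by, n=4, search=2):
    """Exhaustively test every offset within +/- `search` pixels and return
    the offset (dx, dy) into `ref` with the lowest SAD, plus that SAD."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the reference picture.
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Keep the first candidate with the strictly lowest cost.
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost
```

The returned vector follows the convention of the definition quoted above: it is an offset from block coordinates in the current picture to coordinates in the reference picture. Exhaustive search is the simplest optimisation strategy and is used here only for clarity; a different matching metric or a faster search pattern could be substituted without changing the overall structure.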
Related courses (24)
PHYS-101(en): General physics : mechanics (English)
Students will learn the principles of mechanics to enable a better understanding of physical phenomena, such as the kinematics and dynamics of point masses and solid bodies. Students will acquire the c
ME-201: Continuum mechanics
Continuum conservation laws (e.g. mass, momentum and energy) will be introduced. Mathematical tools, including basic algebra and calculus of vectors and Cartesian tensors will be taught. Stress and de
EE-550: Image and video processing
This course covers fundamental notions in image and video processing, as well as the most popular tools used, such as edge detection, motion estimation, segmentation, and compression. It is compose
Related lectures (95)
Block Pulled by a Spring: Dynamics
Covers the dynamics of a block connected to a spring, deriving equations of motion and solving for key time points.
Continuum Mechanics: Modeling Deformation and Flow
Covers the fundamentals of Continuum Mechanics, focusing on modeling deformation and flow in materials.
Motion
Explores motion estimation methods in video processing, covering displacement, motion field, optical flow, and various techniques like gradient methods and block matching.
Related publications (289)

Efficient Temporally-Aware DeepFake Detection using H.264 Motion Vectors

Sabine Süsstrunk, Yufan Ren, Peter Arpad Grönquist, Alessio Verardo, Qingyi He

Video DeepFakes are fake media created with Deep Learning (DL) that manipulate a person’s expression or identity. Most current DeepFake detection methods analyze each frame independently, ignoring inconsistencies and unnatural movements between frames. Som ...
2024
Related concepts (12)
High-definition television
High-definition television (HD or HDTV) describes a television system which provides a substantially higher image resolution than the previous generation of technologies. The term has been used since 1936; in more recent times, it refers to the generation following standard-definition television (SDTV), often abbreviated to HDTV or HD-TV. It is the current de facto standard video format used in most broadcasts: terrestrial broadcast television, cable television, satellite television and Blu-ray Discs.
Video coding format
A video coding format (or sometimes video compression format) is a content representation format for storage or transmission of digital video content (such as in a data file or bitstream). It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. A specific software, firmware, or hardware implementation capable of compression or decompression to/from a specific video coding format is called a video codec.
Scale-invariant feature transform
The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving. SIFT keypoints of objects are first extracted from a set of reference images and stored in a database.
Related MOOCs (17)
Plasma Physics: Introduction
Learn the basics of plasma, one of the fundamental states of matter, and the different types of models used to describe it, including fluid and kinetic.
Plasma Physics: Applications
Learn about plasma applications from nuclear fusion powering the sun, to making integrated circuits, to generating electricity.