Pascal Frossard, Adel Rahmoune, Pierre Vandergheynst
This paper presents a highly flexible video coding scheme (MP3D), based on a redundant 3-D spatio-temporal dictionary of functions. Directionality and anisotropic scaling are key ingredients of the spatial components, which form a rich collection of 2-D visual primitives. The temporal component is tuned to capture most of the energy in the temporal signal evolution along motion trajectories in the video sequences. The MP3D video coding scheme first computes motion trajectories, which are losslessly entropy-coded and sent as side information to the decoder. It then applies a spatio-temporal decomposition using an adaptive approximation algorithm based on Matching Pursuit (MP). Quantized coefficients and basis function parameters are entropy-coded in an embedded stream that is constructed to respect multiple rate constraints. The geometric properties of the 2-D primitive dictionary allow flexible spatial resolution adaptation, so that the MP3D stream can be decoded at multiple rates and spatio-temporal resolutions. The MP3D scheme is shown to provide rate-distortion performance at low and medium bit rates comparable to that of state-of-the-art schemes such as H.264 and MPEG-4, or the scalable MC EZBC. It also provides increased flexibility in stream manipulation, adapting to non-octave-based spatial resolutions or to arbitrary rate constraints. However, the use of a redundant dictionary is a penalty at high coding rates, so the MP3D algorithm is mainly of interest for low-rate applications, or as a flexible base layer for higher-rate video systems.
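To make the adaptive approximation step concrete, the following is a minimal sketch of generic Matching Pursuit over a small redundant dictionary. It is not the MP3D implementation: the 4-sample signal, the seven toy atoms, and the helper names (`matching_pursuit`, `normalize`, `energy`) are assumptions chosen for illustration only, and the sketch omits motion trajectories, quantization, and entropy coding.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean norm (MP assumes unit-norm atoms)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def energy(v):
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in v)

def matching_pursuit(signal, atoms, n_iter):
    """Greedy MP: at each step pick the atom with the largest
    |inner product| with the residual, then subtract its projection."""
    residual = list(signal)
    picks = []  # ordered list of (atom index, coefficient)
    for _ in range(n_iter):
        best_idx, best_coef = 0, 0.0
        for i, atom in enumerate(atoms):
            c = sum(r * a for r, a in zip(residual, atom))
            if abs(c) > abs(best_coef):
                best_idx, best_coef = i, c
        if best_coef == 0.0:  # residual is orthogonal to every atom
            break
        picks.append((best_idx, best_coef))
        residual = [r - best_coef * a
                    for r, a in zip(residual, atoms[best_idx])]
    return picks, residual

# Hypothetical toy dictionary: the canonical basis of R^4 plus three
# extra atoms, which makes it redundant (7 atoms in 4 dimensions).
atoms = [normalize(a) for a in [
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
    [1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1],
]]
signal = [2.0, 2.0, 0.5, 0.0]
picks, residual = matching_pursuit(signal, atoms, 3)
```

Because the atoms are selected in order of decreasing contribution, truncating `picks` after any number of entries still yields a valid, coarser approximation. This ordering is the property that an embedded, rate-adaptive stream can build on.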