Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
In free viewpoint video systems, where a user has the freedom to select a virtual view from which an observation image of the 3D scene is rendered, the scene is commonly represented by texture and depth images from multiple nearby viewpoints. In such representation, there exists data redundancy across multiple dimensions: a single visible 3D voxel may be represented by pixels in multiple viewpoint images (inter-view redundancy), a pixel patch may recur in a distant spatial region of the same image due to self-similarity (inter-patch redundancy), and pixels in a local spatial region tend to be similar (inter-pixel redundancy). It isimportant to exploit these redundancies for effective multiview video compression. Existing schemes attempt to eliminate them via the traditional video coding paradigm of hybrid signal prediction/residual coding; typically, the encoder codes explicit information to guide the decoder to the location of the most similar block along with the signal differential. In this paper, we argue that, given the inherent redundancy in the representation, the decoder can often independently recover missing data via inpainting without explicit directions from encoder, resulting in lower coding overhead. Specifically, after pixels in a reference view are projected to a target view via depth image-based rendering (DIBR) at the decoder, the remaining holes in the target view are filled via an inpainting process in a block-by-block manner. First, blocks are ordered in terms of difficulty-to-inpaint by the decoder. Then, explicit instructions are only sent for the reconstruction of the most difficult blocks. In particular, the missing pixels are explicitly coded via a graph Fourier transform (GFT) or a sparsification procedure using DCT, which leads to low coding cost. For the blocks that are easy to inpaint, the decoder independently completes missing pixels via template-based inpainting. We implemented our encoder-driven inpainting strategy as an extension of High Efficiency Video Coding (HEVC). Experimental results show that our coding strategy can outperform comparable implementation of HEVC by up to 0.8dB in reconstructed image quality
Sabine Süsstrunk, Yufan Ren, Peter Arpad Grönquist, Alessio Verardo, Qingyi He
Andreas Peter Burg, Alexios Konstantinos Balatsoukas Stimming, Andreas Toftegaard Kristensen, Yifei Shen, Yuqing Ren, Leyu Zhang, Chuan Zhang