The natural next step in improving the realism of multimedia services is interactive multiview video (IMV). IMV promises to let users navigate freely through a scene by selecting their preferred viewpoint from any position for which a corresponding view can be generated. Smooth navigation can be achieved by combining camera views with views synthesized at the decoder. However, the large amount of data required for such a navigation experience remains a challenge for current systems, which calls for new, efficient coding strategies that save storage and transmission resources while preserving interactivity in the navigation. In this thesis, we focus on the optimization of coding strategies for IMV systems. First, we address coding techniques for IMV in a multiview video plus depth (MVD) scenario, where texture and depth maps are available for view synthesis at the decoder. We propose a low-complexity algorithm that selects the inter-view prediction structures (PSs) and the associated texture and depth quantization parameters (QPs) for IMV under transmission and storage constraints. Simulation results show that this low-complexity algorithm achieves near-optimal compression efficiency while preserving interactivity at the decoder. Then, considering the limited and heterogeneous capabilities of current networks and decoding devices, we propose a novel adaptive solution for IMV based on a layered multiview representation, in which camera views are organized into layered subsets that offer different levels of navigation quality depending on each client's constraints. We propose an optimal algorithm and a reduced-complexity greedy algorithm that jointly select the view subsets and their encoding rates.
Simulation results show that our algorithms perform well compared to a baseline algorithm, which indicates that an effective adaptive IMV solution should take into account the scene content, the client capabilities and the client preferences when building adaptive systems for multiview navigation. Finally, we build on the solution to this second problem and present a general solution to rate allocation problems in multiview video. In particular, we propose a new algorithm to find the optimal Lagrange multiplier in a Lagrangian-based rate allocation problem. We evaluate the proposed algorithm in both multiview and monoview video scenarios and show that it is competitive with complex state-of-the-art rate control techniques. In summary, this thesis addresses important issues in multiview video coding for the design of efficient IMV systems under resource constraints. Our algorithm for selecting the optimal PS and QPs in an MVD scenario can improve the quality of the rendered views and provides new insights into the specific coding requirements of IMV. We show that our algorithm for a layered representation of multiview video provides an effective adaptive streaming solution for IMV systems whose users have limited and heterogeneous capabilities. Finally, our Lagrangian-based rate allocation algorithm, with its optimized selection of the Lagrange multiplier, is a general contribution that can be used in multiple video scenarios.
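To make the idea of Lagrangian-based rate allocation concrete, here is a minimal sketch in Python. It is not the algorithm proposed in the thesis, whose details the abstract does not give. It uses plain bisection on the multiplier, a generic textbook approach, and assumes each coding unit (a frame or a view) has a small discrete set of (rate, distortion) operating points. The names `allocate`, `total_rate` and `bisect_lambda`, as well as the toy operating points, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (rate, distortion) of one operating point


def allocate(units: List[List[Point]], lam: float) -> List[Point]:
    """For a fixed multiplier, each unit independently minimizes D + lam * R."""
    return [min(points, key=lambda p: p[1] + lam * p[0]) for points in units]


def total_rate(choice: List[Point]) -> float:
    return sum(rate for rate, _ in choice)


def bisect_lambda(units: List[List[Point]], budget: float,
                  lo: float = 0.0, hi: float = 1e6, iters: int = 60):
    """Bisection on the multiplier (assumed search range [lo, hi]).

    The total rate is non-increasing in lam, so we look for the smallest
    lam whose allocation still fits the rate budget.
    """
    if total_rate(allocate(units, hi)) > budget:
        raise ValueError("budget not reachable within the assumed lambda range")
    if total_rate(allocate(units, lo)) <= budget:
        return lo, allocate(units, lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_rate(allocate(units, mid)) > budget:
            lo = mid   # too much rate: penalize rate more strongly
        else:
            hi = mid   # feasible: try a smaller multiplier
    return hi, allocate(units, hi)  # hi is always a feasible multiplier


# Toy example with two units and three operating points each (made-up values).
units = [
    [(1.0, 10.0), (2.0, 6.0), (4.0, 3.0)],
    [(1.0, 8.0), (3.0, 4.0), (5.0, 1.0)],
]
lam, choice = bisect_lambda(units, budget=5.0)
print(lam, choice, total_rate(choice))
```

Bisection works here because raising the multiplier can only push each unit toward lower-rate points. With discrete operating points, the budget may not be met exactly, so the sketch returns the allocation with the highest total rate it finds that does not exceed the budget.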