
# Mattia Rossi

This person is no longer with EPFL


## Related research domains (8)

## Related publications (13)

Light field

The light field is a vector function that describes the amount of light flowing in every direction through every point in space. The space of all possible light rays is given by the five-dimensional plenoptic function, and the magnitude of each ray is given by its radiance. Michael Faraday was the first to propose that light should be interpreted as a field, much like the magnetic fields on which he had been working. The phrase light field was coined by Andrey Gershun in a classic 1936 paper on the radiometric properties of light in three-dimensional space.
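The description above can be made concrete with the standard notation (my addition, not from this page): the plenoptic function assigns a radiance to every ray in space, and in free space one dimension is redundant, yielding the familiar 4D light field.

```latex
% 5D plenoptic function: radiance L along the ray passing through the
% point (x, y, z) in the direction (\theta, \phi)
L = L(x, y, z, \theta, \phi)

% In free space, radiance is constant along a ray, so the light field
% reduces to 4D, e.g. in the two-plane parameterization with
% coordinates (u, v) and (s, t) on the two reference planes
L = L(u, v, s, t)
```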

Regularization (mathematics)

In mathematics, statistics, finance, and computer science, particularly in machine learning and inverse problems, regularization is a process that changes the answer to a problem to be "simpler". It is often used to obtain results for ill-posed problems or to prevent overfitting. Although regularization procedures can be divided in many ways, the following delineation is particularly helpful: explicit regularization is regularization whenever one explicitly adds a term to the optimization problem.
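As a minimal illustration of explicit regularization (my example, not from the text above), ridge regression adds a quadratic penalty term to a least-squares problem; the penalty makes the problem well-posed and shrinks the solution toward zero.

```python
import numpy as np

# Explicit regularization sketch: ridge regression. The closed-form
# solution of  min_x ||A x - b||^2 + lam * ||x||^2  is
# x = (A^T A + lam * I)^{-1} A^T b.

def ridge(A, b, lam):
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
x_true = rng.normal(size=5)
b = A @ x_true + 0.01 * rng.normal(size=20)

x0 = ridge(A, b, 0.0)    # no regularization: plain least squares
x1 = ridge(A, b, 10.0)   # heavily regularized: shrunk toward zero
print(np.linalg.norm(x1) < np.linalg.norm(x0))  # True: shrinkage
```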

Light field camera

A light field camera, also known as a plenoptic camera, is a camera that captures information about the light field emanating from a scene; that is, the intensity of light in a scene, and also the precise direction that the light rays are traveling in space. This contrasts with conventional cameras, which record only light intensity at various wavelengths. One type uses an array of micro-lenses placed in front of an otherwise conventional image sensor to sense intensity, color, and directional information.
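The micro-lens design implies a simple data layout: each k×k pixel block under one lenslet samples k×k ray directions, so the sub-aperture views can be recovered by de-interleaving the sensor image. The sketch below is a simplified illustration of that idea (real lenslet images need devignetting and resampling; `lenslet_to_views` is a hypothetical helper, not a real API).

```python
import numpy as np

# Hypothetical de-interleaving of a (grayscale) lenslet image into
# sub-aperture views. Assumption: each k x k pixel block under one
# microlens samples k x k ray directions, so view (u, v) consists of
# every k-th pixel with offset (u, v).

def lenslet_to_views(img, k):
    h, w = img.shape
    assert h % k == 0 and w % k == 0
    # views[u, v] is the sub-aperture image seen from direction (u, v)
    return np.array([[img[u::k, v::k] for v in range(k)]
                     for u in range(k)])

img = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 lenslet image
views = lenslet_to_views(img, 3)
print(views.shape)  # (3, 3, 2, 2): 3x3 views of 2x2 pixels each
```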

Deep Neural Networks (DNNs) have the potential to improve the quality of image-based 3D reconstructions. However, the use of DNNs in the context of 3D reconstruction from large and high-resolution image datasets is still an open challenge, due to memory and computational constraints. We propose a pipeline which takes advantage of DNNs to improve the quality of 3D reconstructions while being able to handle large and high-resolution datasets. In particular, we propose a confidence prediction network explicitly tailored for Multi-View Stereo (MVS) and we use it for both depth map outlier filtering and depth map refinement within our pipeline, in order to improve the quality of the final 3D reconstructions. We train our confidence prediction network on (semi-)dense ground truth depth maps from publicly available real world MVS datasets. With extensive experiments on popular benchmarks, we show that our overall pipeline can produce state-of-the-art 3D reconstructions, both qualitatively and quantitatively.
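The confidence prediction network itself is not reproduced here; the outlier-filtering step it feeds can be sketched as follows (illustrative only, with made-up confidence values standing in for the network's output):

```python
import numpy as np

# Sketch of confidence-based depth map outlier filtering. In the
# pipeline described above the per-pixel confidences come from a
# learned network; here they are given. Pixels whose confidence falls
# below a threshold are invalidated (set to NaN) so that they do not
# contribute to the fused 3D reconstruction.

def filter_depth(depth, confidence, tau=0.5):
    out = depth.copy()
    out[confidence < tau] = np.nan
    return out

depth = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
conf = np.array([[0.9, 0.2],
                 [0.8, 0.4]])
filtered = filter_depth(depth, conf, tau=0.5)
print(np.isnan(filtered))  # mask of rejected pixels
```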

As of today, extending human visual capabilities to machines remains both a cornerstone and an open challenge in the development of intelligent systems. On the one hand, increasingly sophisticated imaging devices, capable of sensing richer information than a plain perspective projection of the real world, are critical to allow machines to understand the complex environment around them. On the other hand, despite the advances in imaging, the complexity of the real world cannot be fully captured by a single imaging device, due either to intrinsic hardware limitations or to the complexity of the environment itself. As a consequence, extending human visual capabilities to machines inevitably requires estimating unknown quantities, which cannot be measured directly, from the available captured data. Equivalently, imaging requires the solution of arbitrarily complex inverse problems.
In most scenarios, inverse problems are ill-posed and admit an infinite number of solutions, of which only one or a few are the desired ones. It therefore becomes crucial to reduce, equivalently *to regularize*, the solution space by exploiting all the available prior information about the problem structure and, especially, about the target quantity to estimate. In this thesis we investigate the use of graph-based regularizers to encode our prior knowledge about the target quantity and to inject it directly into the inverse problem. In particular, we cast the inverse problem into an optimization task, where the target quantity is modelled as a graph whose topology captures our prior knowledge. In order to show the effectiveness and the flexibility of graph-based regularizers, we study their use in different inverse imaging problems, each one characterized by different geometrical constraints.
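A minimal sketch of a smooth graph-based regularizer of the kind described, assuming a quadratic (Tikhonov-like) penalty built from a graph Laplacian; the specific regularizers developed in the thesis differ, so treat this as an illustration of the general idea only.

```python
import numpy as np

# Smooth graph-based regularization of a denoising-type inverse
# problem:  min_x ||x - y||^2 + lam * x^T L x,  where L is the
# Laplacian of a graph whose edges encode prior correlations between
# entries of the target x. Closed form: x = (I + lam * L)^{-1} y.

def graph_tikhonov(y, L, lam):
    n = y.shape[0]
    return np.linalg.solve(np.eye(n) + lam * L, y)

# Path graph on 4 nodes: neighbours are encouraged to agree.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(1)) - W            # combinatorial graph Laplacian
y = np.array([0.0, 0.0, 10.0, 0.0])  # observation with an outlier spike
x = graph_tikhonov(y, L, lam=2.0)
print(x)  # the spike is smoothed across the graph
```

Note that because the Laplacian annihilates constant vectors, this smoothing redistributes the signal along the graph without changing its total mass.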
We start by investigating how to augment the resolution of a light field. Although light field cameras can capture the 3D information in a scene within a single exposure, thus providing much richer information than a perspective camera, their compact design limits their spatial resolution dramatically. We present a smooth graph-based regularizer which models the geometry of a light field explicitly, and we use it to augment the light field spatial resolution while relying only on the complementary information encoded in the low resolution light field views. In particular, we show that a graph-based regularizer makes it possible to enforce the light field geometric structure without the need for a precise and costly disparity estimation step.
Then we analyze the further benefits provided by nonsmooth graph-based regularizers, as these preserve edges and fine details better than their smooth counterparts. In particular, we focus on a specific nonsmooth graph-based regularizer and show its effectiveness within two applications. The first application revolves again around light field super-resolution, which permits a comparison with the smooth regularizer adopted previously. The second application is disparity estimation in omnidirectional stereo systems, where the two captured images and the target disparity map live on a spherical surface, hence a graph-based regularizer can be used to model the non-trivial correlation underlying each signal.
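The edge-preserving advantage of nonsmooth regularizers can be seen numerically (my toy example, not from the thesis): a quadratic graph penalty sums squared edge differences and so over-penalizes one sharp jump relative to a gradual ramp, while a total-variation-style penalty sums absolute differences and charges both the same.

```python
import numpy as np

# Smooth (quadratic) vs nonsmooth (total-variation-like) graph
# penalties on a path graph. The two signals share the same endpoints:
# one has a single sharp edge, the other ramps up gradually.

edges = [(0, 1), (1, 2), (2, 3)]              # path graph
piecewise = np.array([0.0, 0.0, 5.0, 5.0])    # one sharp edge
ramp = np.array([0.0, 5 / 3, 10 / 3, 5.0])    # gradual transition

def quad(x):  # smooth penalty: sum of squared edge differences
    return sum((x[i] - x[j]) ** 2 for i, j in edges)

def tv(x):    # nonsmooth penalty: sum of absolute edge differences
    return sum(abs(x[i] - x[j]) for i, j in edges)

print(quad(piecewise), quad(ramp))  # 25.0 vs ~8.33: quadratic prefers ramps
print(tv(piecewise), tv(ramp))      # both ~5.0: TV treats them equally
```

This is why minimizing a quadratic penalty blurs sharp depth or disparity discontinuities, whereas a TV-like penalty can keep them intact.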
Finally, we investigate the refinement of a depth map and the estimation of the corresponding normal map. In fact, ...

Pascal Frossard, Mireille El Gheche, Mattia Rossi

Depth estimation is an essential component in understanding the 3D geometry of a scene, with numerous applications in urban and indoor settings. These scenes are characterized by a prevalence of human-made structures which, in most cases, are either inherently piece-wise planar or can be approximated as such. In these settings, we devise a novel depth refinement framework that aims at recovering the underlying piece-wise planarity of the inverse depth map. We formulate this task as an optimization problem involving a data fidelity term that minimizes the distance to the input inverse depth map, as well as a regularization term that enforces a piece-wise planar solution. For the regularization term, we model the inverse depth map as a weighted graph between pixels. The proposed regularization is designed to estimate a plane automatically at each pixel, without any need for an a priori estimation of the scene planes, and at the same time it encourages similar pixels to be assigned to the same plane. The resulting optimization problem is solved efficiently with the ADAM algorithm. Experiments show that our method leads to a significant improvement in depth refinement, both visually and numerically, with respect to state-of-the-art algorithms on the Middlebury, KITTI, and ETH3D multi-view stereo datasets.
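The abstract names ADAM as the solver. The graph regularizer itself is not reproduced here; as a sketch of the solver side only, here is the standard Adam update rule (Kingma & Ba) implemented in plain NumPy and driven by a toy data-fidelity objective in place of the paper's full energy.

```python
import numpy as np

# Plain-NumPy Adam, minimizing the toy objective f(x) = ||x - y||^2
# (gradient 2 * (x - y)) as a stand-in for the paper's full energy of
# data fidelity plus piece-wise planar graph regularizer.

def adam(grad, x0, steps=2000, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    x, m, v = x0.copy(), np.zeros_like(x0), np.zeros_like(x0)
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment estimate
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

y = np.array([1.0, -2.0, 3.0])           # toy "input inverse depth"
x = adam(lambda x: 2 * (x - y), np.zeros(3))
print(x)  # close to y, the minimizer of the toy objective
```

In the actual framework, the gradient passed to the optimizer would combine the fidelity term with the graph-based planarity regularizer.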

2020