Cameras were invented by imitating the human visual system to capture a scene, and camera technology has advanced substantially in recent years: 108 MP resolution with 100x hybrid zoom has become a standard feature of flagship smartphones. Despite these developments, single cameras remain restricted in terms of field of view (FoV), depth perception, pixel size, lens system, and more. Multicamera systems have therefore become progressively more prevalent as a remedy for these limitations. The FoV can be widened through image stitching. Depth can be computed by stereo matching methods. Lens flaws and imperfections can be corrected with computer vision and image processing algorithms. Processing the vast number of pixels coming from multiple cameras, however, requires significant computational power. GPUs and FPGAs are promising platforms for implementing computer vision and video processing applications because of their extensive parallelization capabilities. FPGA platforms offer particular advantages for real-time and portable vision applications. They provide lower latency because they connect to the image sensors at a low level, and they allow a system architecture dedicated to the target application. Compact systems can be built by designing custom PCBs that include only the necessary ports. FPGAs also consume less power and are a cheaper option than GPUs. GPUs, on the other hand, offer more versatility and easier design and upgrades. In light of these observations, this thesis presents real-time, high-resolution multi-view 3D and panoramic systems that integrate software and hardware.
First, the depth estimation system is presented. The proposed system runs in real time at depth map resolutions of up to 2K. It adopts a trinocular scheme to address the occlusion problem. The pixel correspondence challenge in textureless regions, from which conventional stereo-matching-based depth estimation systems suffer, is tackled by projecting artificial patterns through an integrated pico-projector. The application-specific system architecture ensures high-performance depth map streaming.
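As a rough illustration of the stereo matching idea mentioned above, the following minimal sketch finds the disparity of one pixel on a rectified scanline pair with a sum-of-absolute-differences (SAD) window and converts it to depth with the standard relation Z = f * B / d. It is not the thesis's trinocular, pattern-projected pipeline; the function names, the window size, and the focal length and baseline values are all hypothetical and chosen only for the example.

```python
def sad_disparity(left, right, x, half_window, max_disparity):
    """Disparity (in pixels) at column x of a rectified scanline pair.

    Compares a window centred at x in the left scanline with windows
    shifted d pixels to the left in the right scanline and returns the
    shift with the lowest sum of absolute differences.
    The caller must ensure x + half_window < len(left).
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break  # shifted window would leave the image
        cost = sum(
            abs(left[x + k] - right[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: treat as infinitely far
    return focal_px * baseline_m / disparity_px


# Toy scanlines: the same intensity pattern appears 2 pixels further
# left in the right view than in the left view.
left_row = [0, 0, 0, 0, 0, 9, 5, 9, 0, 0]
right_row = [0, 0, 0, 9, 5, 9, 0, 0, 0, 0]

d = sad_disparity(left_row, right_row, x=6, half_window=1, max_disparity=4)
# Hypothetical calibration: 1000 px focal length, 10 cm baseline.
z = depth_from_disparity(d, focal_px=1000.0, baseline_m=0.1)
```

In a uniform (textureless) region every candidate window would give nearly the same cost, which is the ambiguity that projecting an artificial pattern is meant to remove.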
Second, the drone detection and tracking system is presented. The proposed system is capable of simultaneously monitoring the full 360° environment and detecting a drone at long range in real time. The distributed architecture design enables ultra-high-resolution image processing. [...] data coming from the hardware part. A GPU design is chosen for its high level of parallelization capability. The proposed system is suitable for surveillance applications such as drone detection, passive radar systems, vast terrain monitoring, and border control.
Third, the 3D stereoscopic panorama construction system is presented. The proposed system generates two separate panoramas, one for the left eye and one for the right, to achieve 3D perception. The system offers a novel camera arrangement and a 3D panorama generation algorithm. The cameras are positioned to minimize the intra-panorama parallax while increasing the inter-panorama parallax to enhance 3D perception.
Finally, the conclusion of the thesis discusses the pros and cons of the real-time vision systems and presents predictions for the future.
Edoardo Charbon, Andrei Ardelean