Current transformer-based skeletal action recognition models tend to focus on a limited set of joints and low-level motion patterns to predict action classes. This results in significant performance degradation under small skeleton perturbations or changin ...
Video DeepFakes are fake media created with Deep Learning (DL) that manipulate a person’s expression or identity. Most current DeepFake detection methods analyze each frame independently, ignoring inconsistencies and unnatural movements between frames. Som ...
In this work, we tackle the task of estimating the 6D pose of an object from point cloud data. While recent learning-based approaches have shown remarkable success on synthetic datasets, we have observed them to fail in the presence of real-world data. We ...
Designing novel materials is greatly dependent on understanding the design principles, physical mechanisms, and modeling methods of material microstructures, requiring experienced designers with expertise and several rounds of trial and error. Although rec ...
Modern neuroscience research is generating increasingly large datasets, from recording thousands of neurons over long timescales to behavioral recordings of animals spanning weeks, months, or even years. Despite a great variety in recording setups and expe ...
Recent advancements in deep learning have revolutionized 3D computer vision, enabling the extraction of intricate 3D information from 2D images and video sequences. This thesis explores the application of deep learning in three crucial challenges of 3D com ...
Photometric stereo, a computer vision technique for estimating the 3D shape of objects through images captured under varying illumination conditions, has been a topic of research for nearly four decades. In its general formulation, photometric stereo is an ...
The recent rise in interest in point clouds as an imaging modality has motivated standardization groups such as JPEG and MPEG to launch activities aiming at developing compression standards for point clouds. Lossy compression usually introduces visual arti ...
In the rapidly evolving landscape of machine learning research, neural networks stand out with their ever-expanding number of parameters and reliance on increasingly large datasets. The financial cost and computational resources required for the training p ...
A range of behavioral and contextual factors, including eating and drinking behavior, mood, social context, and other daily activities, can significantly impact an individual's quality of life and overall well-being. Therefore, inferring everyday life aspe ...