Publication

Human Detection and Segmentation via Multi-view Consensus

Abstract

Self-supervised detection and segmentation of foreground objects aims for accuracy without annotated training data. However, existing approaches predominantly rely on restrictive assumptions on appearance and motion. For scenes with dynamic activities and camera motion, we propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training via coarse 3D localization in a voxel grid and fine-grained offset regression. In this manner, we learn a joint distribution of proposals over multiple views. At inference time, our method operates on single RGB images. We outperform state-of-the-art techniques both on images that visually depart from those of standard benchmarks and on those of the classical Human3.6M dataset.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related concepts (32)
Unsupervised learning
Unsupervised learning, is paradigm in machine learning where, in contrast to supervised learning and semi-supervised learning, algorithms learn patterns exclusively from unlabeled data. Neural network tasks are often categorized as discriminative (recognition) or generative (imagination). Often but not always, discriminative tasks use supervised methods and generative tasks use unsupervised (see Venn diagram); however, the separation is very hazy. For example, object recognition favors supervised learning but unsupervised learning can also cluster objects into groups.
Object co-segmentation
In computer vision, object co-segmentation is a special case of , which is defined as jointly segmenting semantically similar objects in multiple images or video frames. It is often challenging to extract segmentation masks of a target/object from a noisy collection of images or video frames, which involves object discovery coupled with . A noisy collection implies that the object/target is present sporadically in a set of images or the object/target disappears intermittently throughout the video of interest.
Feature learning
In machine learning, feature learning or representation learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature engineering and allows a machine to both learn the features and use them to perform a specific task. Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process.
Show more
Related publications (32)

Aggregating Spatial and Photometric Context for Photometric Stereo

David Honzátko

Photometric stereo, a computer vision technique for estimating the 3D shape of objects through images captured under varying illumination conditions, has been a topic of research for nearly four decades. In its general formulation, photometric stereo is an ...
EPFL2024

Robust machine learning for neuroscientific inference

Steffen Schneider

Modern neuroscience research is generating increasingly large datasets, from recording thousands of neurons over long timescales to behavioral recordings of animals spanning weeks, months, or even years. Despite a great variety in recording setups and expe ...
EPFL2024

Self-supervised Dense Representation Learning for Live-Cell Microscopy with Time Arrow Prediction

Martin Weigert, Benjamin Tobias Gallusser, Max Stieber

State-of-the-art object detection and segmentation methods for microscopy images rely on supervised machine learning, which requires laborious manual annotation of training data. Here we present a self-supervised method based on time arrow prediction pre-t ...
Cham2023
Show more
Related MOOCs (1)
IoT Systems and Industrial Applications with Design Thinking
The first MOOC to provide a comprehensive introduction to Internet of Things (IoT) including the fundamental business aspects needed to define IoT related products.

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.