Rethinking Pose Estimation in Crowds: Overcoming the Detection Information Bottleneck and Ambiguity

Alexander Mathis
2023
Conference paper

Abstract

Frequent interactions between individuals are a fundamental challenge for pose estimation algorithms. Current pipelines either use an object detector together with a pose estimator (top-down approach), or localize all body parts first and then link them to predict the pose of individuals (bottom-up approach). Yet, when individuals closely interact, top-down methods are ill-defined due to overlapping individuals, and bottom-up methods often falsely infer connections to distant body parts. Thus, we propose a novel pipeline called bottom-up conditioned top-down pose estimation (BUCTD) that combines the strengths of bottom-up and top-down methods. Specifically, we propose to use a bottom-up model as the detector, which in addition to an estimated bounding box provides a pose proposal that is fed as a condition to an attention-based top-down model. We demonstrate the performance and efficiency of our approach on animal and human pose estimation benchmarks. On CrowdPose and OCHuman, we outperform previous state-of-the-art models by a significant margin. We achieve 78.5 AP on CrowdPose and 48.5 AP on OCHuman, an improvement of 8.6% and 7.8% over the prior art, respectively. Furthermore, we show that our method strongly improves the performance on multi-animal benchmarks involving fish and monkeys. The code is available at https://github.com/amathislab/BUCTD.
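
To make the detector step described in the abstract concrete, here is a minimal sketch of how a bottom-up model's output could yield both a bounding box and a pose proposal for each individual. It is not the authors' implementation: the function names `box_from_pose` and `detections_from_bottom_up`, the confidence threshold `min_conf`, and the padding factor `margin` are all assumptions chosen for illustration. The conditioned top-down model itself is not shown.

```python
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float, float]    # (x, y, confidence)
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_from_pose(pose: List[Keypoint], min_conf: float = 0.2,
                  margin: float = 0.1) -> Optional[Box]:
    """Enclose the confident keypoints of one pose proposal in a padded box.

    min_conf and margin are illustrative values, not taken from the paper.
    """
    pts = [(x, y) for x, y, c in pose if c >= min_conf]
    if not pts:
        return None  # no reliable keypoint, so no detection for this proposal
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    pad_x = margin * (x1 - x0)
    pad_y = margin * (y1 - y0)
    return (x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y)


def detections_from_bottom_up(
    poses: List[List[Keypoint]],
) -> List[Tuple[Box, List[Keypoint]]]:
    """Pair every usable pose proposal with its derived bounding box.

    Each (box, pose) pair is what a conditioned top-down model would
    receive: the box selects the image crop, the pose is the condition.
    """
    detections = []
    for pose in poses:
        box = box_from_pose(pose)
        if box is not None:
            detections.append((box, pose))
    return detections
```

In this sketch the pose proposal travels alongside the box rather than being discarded, which reflects the abstract's point that a box alone does not say which of several overlapping individuals the top-down model should estimate.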

Official source

https://infoscience.epfl.ch/record/306250?ln=fr
