In recent years, we have seen a dramatic increase in the amount of video data recorded and stored around the world. Driven by the availability of low-cost video cameras, the ever-decreasing cost of digital media storage, and the explosion in popularity of video sharing across the Internet, there is a growing demand for sophisticated methods to automatically analyze and understand video content. One of the most fundamental processes in understanding video content is visual multi-object tracking: locating, identifying, and determining the dynamic configuration of one or many moving (possibly deformable) objects in each frame of a video sequence. In this dissertation, we focus on a general probabilistic approach known as recursive state-space Bayesian estimation, which estimates the unknown probability distribution of the state of the objects recursively over time, using information extracted from video data. The central problem addressed in this dissertation is the development of novel probabilistic models within this framework to perform accurate, robust, automatic visual multi-object tracking. In addressing this problem, we consider the following questions: What types of probabilistic models can we develop to improve on the state of the art, and where do the improvements come from? What benefits and drawbacks are associated with these models? How can we objectively evaluate the performance of a multi-object tracking model? How can a probabilistic multi-object tracking model be extended to perform human activity recognition tasks? Over the course of our work, we attempt to answer each of these questions, beginning with a proposal for a comprehensive set of measures and a formal protocol for evaluating multi-object tracking performance.
We proceed by defining two new probabilistic tracking models: the Distributed Partitioned Sampling Particle Filter (DPS PF), which improves the efficiency of a state-of-the-art model, and the Reversible Jump Markov Chain Monte Carlo Particle Filter (RJMCMC PF), which provides a formal framework for efficiently tracking a variable number of objects. Using our proposed evaluation framework, we compare these models with other state-of-the-art tracking methods on a meeting-room head-tracking task. Finally, we show how the RJMCMC PF can be applied to human activity recognition tasks such as detecting abandoned luggage items in a busy train terminal and determining if and when pedestrians look at an outdoor advertisement as they pass.
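To make the idea of recursive state-space Bayesian estimation concrete, the following is a minimal sketch of a generic bootstrap particle filter that tracks a single one-dimensional position. It is not the DPS PF or the RJMCMC PF described above. The random-walk motion model, the Gaussian observation model, and every name and parameter (`bootstrap_particle_filter`, `process_std`, `obs_std`, `n_particles`) are assumptions chosen purely for illustration.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=1.0, obs_std=1.0, seed=0):
    """Illustrative 1-D bootstrap particle filter (hypothetical example).

    Returns the posterior-mean position estimate after each observation.
    """
    rng = random.Random(seed)
    # Assumed prior: particles spread broadly around the first observation.
    particles = [observations[0] + rng.gauss(0.0, 5.0)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Prediction: push each particle through the assumed random-walk dynamics.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Update: weight each particle by the assumed Gaussian likelihood of z.
        weights = [math.exp(-0.5 * ((z - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Point estimate: weighted mean of the particle set.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resampling: draw a new, equally weighted particle set.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    noise = random.Random(1)
    truth = [0.5 * t for t in range(30)]              # synthetic trajectory
    observed = [x + noise.gauss(0.0, 1.0) for x in truth]
    for t, est in enumerate(bootstrap_particle_filter(observed)):
        print(f"t={t:2d} truth={truth[t]:6.2f} estimate={est:6.2f}")
```

Each loop iteration performs one step of the recursion: the particle set approximating the previous posterior is propagated through the motion model, reweighted by the new observation, and resampled. A multi-object tracker would replace the scalar state with a joint configuration of all objects, which is where the cost of naive sampling grows and more efficient schemes become relevant.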