Visual Scene Understanding for Transportation: From Detecting Objects To Relationships

George Adaimi
2022
EPFL thesis

Abstract

Transportation, which deals with moving people and goods, has a clear impact on the economic development of our society and on our well-being. Traditionally, transportation was studied and analyzed using expensive sensors, such as induction loops, that are difficult to maintain. Today, with the prevalence of cameras, which are inexpensive and can easily be mounted in various areas, computer vision has found its way into the transportation domain. Computer vision is a field of artificial intelligence (AI) that extracts high-level, relevant information from visual data. While computer vision has seen several advances, several challenges arise when vision-based transportation systems are deployed in complex and uncontrolled environments.

This doctoral thesis introduces various vision-based deep learning methods that handle the challenges encountered in the transportation and mobility domain and are crucial to providing a high-level understanding of a scene. We first tackle an important task in transportation: detecting different agents (e.g., vehicles and pedestrians) in diverse environments such as roads, parking lots, or sidewalks. To solve this task, we propose to leverage dense fields, referred to as Butterfly Fields, as representations to localize and classify all objects in the scene. Using dense representations enables our method to handle the challenges of object occlusion and scale variation in aerial images.

Furthermore, understanding the movement of goods and people over time is critical for various transportation operations. Tracking people and objects requires re-identifying agents across images, which can be challenging due to the visual ambiguity that occurs when independent traffic agents, especially vehicles, are visually similar. We address this challenge with a confidence-based learning framework and demonstrate a performance boost for several re-identification methods irrespective of the type of agent, whether vehicle or person. We further show the benefit and efficacy of our detector and tracker on a common and important traffic management task.

Beyond detecting and re-identifying agents in a scene, extracting the relationships between different objects is another important task in transportation. This problem can be solved using scene graph generation methods, which extract a structured semantic representation of a scene by detecting the objects present and their relationships. In transportation, scene graphs are mainly used as inputs to real-time downstream decision-making tasks, so it is important that such methods be efficient while providing good performance. To that end, we develop an efficient one-step scene graph generation method that provides a comprehensive understanding of a scene. Finally, since open source is an enabler of innovation, we contribute to the collective knowledge in the field of computer vision and transportation by publicly sharing our new roundabout dataset, along with the source code and models of our work.

Official source

https://infoscience.epfl.ch/record/298436?ln=fr

