Publication

STC-GAN: Spatio-Temporally Coupled Generative Adversarial Networks for Predictive Scene Parsing

Jie Luo, Yiyu Wang, Mengshi Qi
2020
Journal paper
Abstract

Predictive scene parsing is the task of assigning pixel-level semantic labels to a future frame of a video. It has many applications in vision-based artificial intelligence systems, e.g., autonomous driving and robot navigation. Although previous work has shown promising performance in semantic segmentation of images and videos, it is still quite challenging to anticipate future scene parsing with limited annotated training data. In this paper, we propose a novel model called STC-GAN, Spatio-Temporally Coupled Generative Adversarial Networks for predictive scene parsing, which employs both convolutional neural networks and convolutional long short-term memory (LSTM) in an encoder-decoder architecture. By virtue of STC-GAN, both spatial layout and semantic context can be captured effectively by the spatial encoder, while motion dynamics are extracted accurately by the temporal encoder. Furthermore, a coupled architecture is presented for establishing joint adversarial training, where the weights are shared and features are transformed in an adaptive fashion between the future frame generation model and the predictive scene parsing model. Consequently, the proposed STC-GAN is able to learn valuable features from unlabeled video data. We evaluate our proposed STC-GAN on two public datasets, i.e., Cityscapes and CamVid. Experimental results demonstrate that our method outperforms the state of the art.
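
The abstract names convolutional LSTM as the building block of the temporal encoder but does not give its equations. As a rough illustration only, the sketch below implements one step of a generic convolutional LSTM cell in NumPy, using the common variant without peephole connections. It is not the authors' implementation: the function names (`conv2d_same`, `convlstm_step`), the gate ordering, the kernel size, and the tensor layout are all assumptions made for this example.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded 'same' cross-correlation.

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) with odd k.
    Returns an array of shape (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    _, height, width = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    out = np.zeros((c_out, height, width))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + height, j:j + width]  # shifted view of the input
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def convlstm_step(x, h, c, w_x, w_h, b):
    """One step of a ConvLSTM cell (assumed variant: no peephole terms).

    x: (C_in, H, W) input frame features
    h, c: (C_hid, H, W) previous hidden and cell states
    w_x: (4*C_hid, C_in, k, k), w_h: (4*C_hid, C_hid, k, k), b: (4*C_hid,)
    Assumed gate order along the channel axis: input, forget, output, candidate.
    """
    gates = conv2d_same(x, w_x) + conv2d_same(h, w_h) + b[:, None, None]
    i, f, o, g = np.split(gates, 4, axis=0)
    c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell memory
    h_next = sigmoid(o) * np.tanh(c_next)              # expose gated memory
    return h_next, c_next


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c_in, c_hid, height, width, k = 3, 4, 8, 8, 3
    w_x = 0.1 * rng.standard_normal((4 * c_hid, c_in, k, k))
    w_h = 0.1 * rng.standard_normal((4 * c_hid, c_hid, k, k))
    b = np.zeros(4 * c_hid)
    h = np.zeros((c_hid, height, width))
    c = np.zeros((c_hid, height, width))
    # Feed a short sequence of random "frames" through the cell.
    for _ in range(5):
        frame = rng.standard_normal((c_in, height, width))
        h, c = convlstm_step(frame, h, c, w_x, w_h, b)
    print(h.shape)
```

Because the gates are computed by convolutions rather than dense matrix products, the hidden state keeps its spatial layout, which is what makes this kind of cell a natural fit for extracting motion dynamics from frame sequences.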
