Text as a Richer Source of Supervision in Semantic Segmentation Tasks

This paper introduces TACOSS, a text-image alignment approach that enables explainable land cover semantic segmentation by directly integrating semantic concepts encoded from text. TACOSS combines convolutional neural networks for visual feature extraction with semantic embeddings provided by a language model. Using contrastive learning, we learn an alignment between the visual representations and the (fixed) textual representations. In addition to producing standard semantic segmentation outputs, our model enables interactive querying of remote sensing (RS) images with natural language prompts. Experimental results on 50 cm resolution aerial data from Switzerland show that TACOSS performs similarly to a standard semantic segmentation model while allowing flexible use of in- and out-of-vocabulary terms when interacting with the image.
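To make the idea of aligning per-pixel visual features with fixed text embeddings more concrete, here is a minimal NumPy sketch. It is not the TACOSS implementation: the abstract does not specify the loss, the encoders, or any hyperparameters, so everything below is an assumption made for illustration. In particular, the function names (`pixel_text_logits`, `alignment_loss`, `segment`, `query`), the temperature of 0.07, and the choice of a softmax cross-entropy over cosine similarities (one common contrastive formulation) are hypothetical, and random arrays stand in for the outputs of a CNN and a language model.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def pixel_text_logits(pixel_feats, text_embeds, temperature=0.07):
    """Cosine similarity between every pixel and every class text embedding.

    pixel_feats: (H, W, D) visual features (assumed to come from a CNN).
    text_embeds: (C, D) fixed text embeddings (assumed to come from a language model).
    Returns an (H, W, C) array of similarities divided by the temperature.
    """
    return (l2_normalize(pixel_feats) @ l2_normalize(text_embeds).T) / temperature


def alignment_loss(pixel_feats, text_embeds, labels, temperature=0.07):
    """Cross-entropy that pulls each pixel toward the text embedding of its class."""
    logits = pixel_text_logits(pixel_feats, text_embeds, temperature)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    return float(-log_probs[rows, cols, labels].mean())


def segment(pixel_feats, text_embeds):
    """Standard segmentation output: the most similar class for each pixel."""
    return pixel_text_logits(pixel_feats, text_embeds).argmax(axis=-1)


def query(pixel_feats, prompt_embed):
    """Similarity map in [-1, 1] for one prompt embedding, in or out of vocabulary."""
    return l2_normalize(pixel_feats) @ l2_normalize(prompt_embed)


# Toy data: random stand-ins for encoder outputs (illustrative only).
rng = np.random.default_rng(0)
text_embeds = rng.normal(size=(3, 16))        # 3 hypothetical class descriptions
labels = rng.integers(0, 3, size=(4, 5))      # ground-truth class per pixel
# "Well-aligned" features: each pixel sits next to the embedding of its class.
pixel_feats = text_embeds[labels] + 0.01 * rng.normal(size=(4, 5, 16))

loss = alignment_loss(pixel_feats, text_embeds, labels)
prediction = segment(pixel_feats, text_embeds)
similarity_map = query(pixel_feats, text_embeds[1])
```

Because the text embeddings stay fixed in this sketch, only the visual features would be trained to reduce `alignment_loss`. The same similarity computation then serves both purposes described in the abstract: taking the arg max over known classes yields a segmentation map, while scoring pixels against the embedding of an arbitrary prompt yields a query heat map.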

Fast and Future: Towards Efficient Forecasting in Video Semantic Segmentation

Aggregating Spatial and Photometric Context for Photometric Stereo

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning
