Prompt–RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering

Chattez avec Graph Search

Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.

AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.

Remote sensing visual question answering (RQA) was recently proposed with the aim of interfacing natural language and vision to ease the access of information contained in Earth Observation data for a wide audience, which is granted by simple questions in natural language. The traditional vision/language interface is an embedding obtained by fusing features from two deep models, one processing the image and another the question. Despite the success of early VQA models, it remains difficult to control the adequacy of the visual information extracted by its deep model, which should act as a context regularizing the work of the language model. We propose to extract this context information with a visual model, convert it to text and inject it, i.e. prompt it, into a language model. The language model is therefore responsible to process the question with the visual context, and extract features, which are useful to find the answer. We study the effect of prompting with respect to a black-box visual extractor and discuss the importance of training a visual model producing accurate context.

Prompt–RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering

Graph Chatbot

Chattez avec Graph Search

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning

Course Design - A Visual and Modular Approach

Social-Transmotion: Promptable Human Trajectory Prediction

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning

Social-Transmotion: Promptable Human Trajectory Prediction

Course Design - A Visual and Modular Approach