Visual Question Answering is a new task that can facilitate the extraction of information from images through textual queries: it aims to answer an open-ended question, formulated in natural language, about a given image. In this work, we introduce a new dataset for visual question answering on remote sensing images: a large-scale, open-access dataset of image/question/answer triplets extracted from the BigEarthNet dataset. It contains close to 15 million samples. We present the dataset construction procedure, its characteristics, and first results obtained with a deep-learning-based methodology. These first results show that the task of visual question answering is challenging and opens interesting new research avenues at the interface of remote sensing and natural language processing. The dataset and the code to create and process it are open and freely available at https://rsvqa.sylvainlobry.com/
Karl Aberer, Rémi Philippe Lebret, Mohammadreza Banaei
Devis Tuia, Sylvain Lobry, Christel Marie Tartini-Chappuis, Javiera Francisca Castillo Navarro, Nicola Antonio Santacroce