Publication

Prompt–RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering

Related publications (49)

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning

Mattia Atzeni

The ability to reason, plan and solve highly abstract problems is a hallmark of human intelligence. Recent advancements in artificial intelligence, propelled by deep neural networks, have revolutionized disciplines like computer vision and natural language ...
EPFL, 2024

Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning

Vinitra Swamy, Jibril Albachir Frej, Paola Mejia Domenzain, Luca Zunino, Tommaso Martorella, Elena Grazia Gado

Intelligent Tutoring Systems (ITS) enhance personalized learning by predicting student answers to provide immediate and customized instruction. However, recent research has primarily focused on the correctness of the answer rather than the student's perfor ...
2024

Text Representation Learning for Low Cost Natural Language Understanding

Jan Frederik Jonas Florian Mai

Natural language processing and other artificial intelligence fields have witnessed impressive progress over the past decade. Although some of this progress is due to algorithmic advances in deep learning, the majority has arguably been enabled by scaling ...
EPFL, 2023

Improving Generalization of Pretrained Language Models

Rabeeh Karimi Mahabadi

In this dissertation, we propose multiple methods to improve transfer learning for pretrained language models (PLMs). Broadly, transfer learning is a powerful technique in natural language processing, where a language model is first pre-trained on a data-r ...
EPFL, 2023

Deep Generative Models for Autonomous Driving: from Motion Forecasting to Realistic Image Synthesis

Saeed Saadatnejad

Forecasting is a capability inherent in humans when navigating. Humans routinely plan their paths, considering the potential future movements of those around them. Similarly, to achieve comparable sophistication and safety, autonomous systems must embrace ...
EPFL, 2023

Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models

Karl Aberer, Rémi Philippe Lebret, Mohammadreza Banaei

Recent transformer language models achieve outstanding results in many natural language processing (NLP) tasks. However, their enormous size often makes them impractical on memory-constrained devices, requiring practitioners to compress them to smaller net ...
Assoc Computational Linguistics-Acl, 2023

Multi-task prompt-RSVQA to explicitly count objects on aerial images

Devis Tuia, Sylvain Lobry, Christel Marie Tartini-Chappuis, Javiera Francisca Castillo Navarro, Nicola Antonio Santacroce

Introduced to enable a wider use of Earth Observation images using natural language, Remote Sensing Visual Question Answering (RSVQA) remains a challenging task, in particular for questions related to counting. To address this specific challenge, we propos ...
2023

Backpropagation-free training of deep physical neural networks

Romain Christophe Rémy Fleury, Ali Momeni, Matthieu Francis Malléjac, Babak Rahmani, Marc Philipp Del Hougne

Recent successes in deep learning for vision and natural language processing are attributed to larger models but come with energy consumption and scalability issues. Current training of digital deep-learning models primarily relies on backpropagation that ...
2023

Framing the News: From Human Perception to Large Language Model Inferences

Daniel Gatica-Perez

Identifying the frames of news is important to understand the articles' vision, intention, message to be conveyed, and which aspects of the news are emphasized. Framing is a widely studied concept in journalism, and has emerged as a new topic in computing, ...
New York, 2023

Language Transformers for Remote Sensing Visual Question Answering

Devis Tuia, Sylvain Lobry, Christel Marie Tartini-Chappuis, Vincent Alexandre Mendez

Remote sensing visual question answering (RSVQA) opens new avenues to promote the use of satellite data, by interfacing satellite image analysis with natural language processing. Capitalizing on the remarkable advances in natural language processing and c ...
2022
