Publication

Interpretable Representation Learning and Evaluation for Abstractive Summarization

Andreas Thomas Marfurt
2023
EPFL thesis

Abstract

Abstractive summarization has seen big improvements in recent years, mostly due to advances in neural language modeling, language model pretraining, and scaling models and datasets. While large language models generate summaries that are fluent, coherent, and integrate the salient information from the source document well, there are still a few challenges. Most importantly, information that is either not supported by the source document (hallucinations) or factually inaccurate finds its way into the machine-written summaries. Moreover, and connected to this first point, knowledge retrieval and summary generation happen implicitly, which leads to a lack of interpretability and controllability of the models.

In this thesis, we contribute to solving these problems by working on making the summarization process more interpretable, faithful, and controllable. The thesis consists of two parts. In Part I, we learn interpretable representations that help with summary structure, faithfulness, and document understanding. First, we plan summary content at the sentence level, building a next sentence representation from the summary generated so far. Second, we integrate an entailment interpretation into standard text-encoding neural network architectures. In the last chapter of the first part, we use multiple object discovery methods from computer vision to identify semantic text units that should facilitate the extraction of salient information from source documents.

In Part II, we turn to the evaluation of summarization models, and also contribute annotated resources for our tasks. We start by using the attentions and probability estimates during summary generation to identify hallucinations. We then apply summarization models in a novel semi-structured setting, where the model is asked to generate an interpretation from a long source document. For this novel task, we develop an evaluation technique that allows efficient contrastive evaluation of generative models with respect to user-specified distinctions.

Official source

https://infoscience.epfl.ch/record/302951?ln=fr

About this result

This page is generated automatically and may contain information that is not correct, complete, up to date, or relevant to your search. The same applies to every other page on this site. Be sure to verify the information against official EPFL sources.

Ontological proximity

Information engineering

Machine learning: Artificial neural network

Natural language processing: Natural language processing

Related concepts (33)

Related publications (56)

Related MOOCs (10)

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning

Toward Automatic Typography Analysis: Serif Classification and Font Similarities

Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning