Evaluating Attention Networks for Anaphora Resolution

Andrei Popescu-Belis, Nikolaos Pappas, Lesly Sadiht Miculicich Werlen
2017
Report or working paper

Abstract

In this paper, we evaluate the inter- and intra-attention mechanisms of two architectures, a Deep Attention Long Short-Term Memory-Network (LSTM-N) (Cheng et al., 2016) and a Decomposable Attention model (Parikh et al., 2016), on anaphora resolution, i.e. detecting coreference relations between a pronoun and a noun (its antecedent). Both models were designed for an entailment task; we adapt them to pronominal coreference resolution by comparing two pairs of sentences: the original pair, which contains the antecedent and the pronoun, and a second pair in which the pronoun is replaced with a correct or an incorrect antecedent. The goal is thus to detect the correct replacements, assuming that the original sentence pair entails the pair with the correct replacement but not a pair with an incorrect replacement. We use the CoNLL-2012 English dataset (Pradhan et al., 2012) to train the models and to evaluate their ability to recognize correct and incorrect pronoun replacements in sentence pairs. We find that the Decomposable Attention model performs better, while using a much simpler architecture. Furthermore, we focus on two previous studies that use intra- and inter-attention mechanisms, discuss how they relate to each other, and examine how these advances help to identify correct antecedent replacements.
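The entailment framing described above can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the names `EntailmentExample`, `replace_pronoun` and `build_examples`, the toy sentences, and the choice to treat the two sentences as a single token list are all assumptions made for illustration. The sketch only shows how one premise (the original text) could be paired with several hypotheses (the text with the pronoun replaced), labelled 1 when the replacement is the correct antecedent and 0 otherwise.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntailmentExample:
    """One premise/hypothesis pair (hypothetical container, not from the paper)."""
    premise: List[str]      # original tokens, pronoun left in place
    hypothesis: List[str]   # same tokens with the pronoun replaced
    label: int              # 1 = correct antecedent (entailed), 0 = incorrect


def replace_pronoun(tokens: List[str], pronoun_index: int, antecedent: str) -> List[str]:
    """Return a copy of tokens with the pronoun swapped for the antecedent's tokens."""
    return tokens[:pronoun_index] + antecedent.split() + tokens[pronoun_index + 1:]


def build_examples(tokens: List[str], pronoun_index: int,
                   correct: str, candidates: List[str]) -> List[EntailmentExample]:
    """Build one labelled pair per candidate antecedent."""
    examples = []
    for candidate in candidates:
        hypothesis = replace_pronoun(tokens, pronoun_index, candidate)
        examples.append(EntailmentExample(tokens, hypothesis, int(candidate == correct)))
    return examples


# Toy input (invented for this sketch, not taken from CoNLL-2012).
tokens = "The dog chased the cat . It was fast .".split()
examples = build_examples(tokens, pronoun_index=6,
                          correct="the dog", candidates=["the dog", "the cat"])

for ex in examples:
    print(ex.label, " ".join(ex.hypothesis))
```

Each resulting pair could then be passed to an entailment classifier such as the two models evaluated here; how the original work selects the incorrect candidates is not specified in the abstract, so the candidate list above is simply given by hand.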

Official source

https://infoscience.epfl.ch/record/231846?ln=fr

