Concept

Entity linking

Related publications (107)

A Text Mining Pipeline Using Active and Deep Learning Aimed at Curating Information in Computational Neuroscience

The curation of neuroscience entities is crucial to ongoing efforts in neuroinformatics and computational neuroscience, such as those being deployed in the context of continuing large-scale brain modelling projects. However, manually sifting through thousa ...

Springer, 2019

Transfer Learning from Pre-trained BERT for Pronoun Resolution

Qianqian Qiao, Xingce Bao

The paper describes the submission of the team "We used bert!" to the shared task Gendered Pronoun Resolution (Pair pronouns to their correct entities). Our final submission model based on the fine-tuned BERT (Bidirectional Encoder Representations from Tra ...

Association for Computational Linguistics (ACL), 2019

FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents

Jean-Philippe Thiran, Hazim Kemal Ekenel, Guillaume Jaume

We present a new dataset for form understanding in noisy scanned documents (FUNSD) that aims at extracting and structuring the textual content of forms. The dataset comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely ...

IEEE, 2019

Machine learning for cross-gazetteer matching of natural features

Elise Acheson

Defining and identifying duplicate records in a dataset is a challenging task which grows more complex when the modeled entities themselves are hard to delineate. In the geospatial domain, it may not be clear where a mountain, stream, or valley ends and be ...

2019

Transformer-Based Multi-lingual Sentence Embeddings

In this thesis, we present a transformers-based multi-lingual embedding model to represent sentences in different languages in a common space. To do so, our system uses the structure of a simplified transformer with a shared byte-pair encoding vocabulary f ...

2019

Learning to Create Sentence Semantic Relation Graphs for Multi-Document Summarization

Boi Faltings, Diego Matteo Antognini

Linking facts across documents is a challenging task, as the language used to express the same information in a sentence can vary significantly, which complicates the task of multi-document summarization. Consequently, existing approaches heavily rely on h ...

2019

Weakly-Supervised Concept-based Adversarial Learning for Cross-lingual Word Embeddings

James Henderson

Distributed representations of words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a single language but also across different languages. Current unsupervised adversarial approaches ...

Association for Computational Linguistics, 2019

Automatic information extraction from historical collections: the case of the 1808 venetian cadaster

Sofia Ares Oliveira

The presentation reports on the on-going work to automatically process heterogeneous historical documents. After a quick overview of the general processing pipeline, a few examples are more comprehensively described. The recent progress in making large col ...

2018

Simple Unsupervised Keyphrase Extraction using Sentence Embeddings

Martin Jaggi, Claudiu-Cristian Musat, Kamil Bennani-Smires

Keyphrase extraction is the task of automatically selecting a small set of phrases that best describe a given free text document. Keyphrases can be used for indexing, searching, aggregating and summarizing text documents, serving many automatic as well as ...

2018

Word Sense Consistency in Statistical and Neural Machine Translation

Xiao Pu

Different senses of source words must often be rendered by different words in the target language when performing machine translation (MT). Selecting the correct translation of polysemous words can be done based on the contexts of use. However, state-of-th ...

EPFL, 2018
