Publication

Intégration de connaissances syntaxiques et sémantiques dans les représentations vectorielles de textes

Related publications (60)

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Idiap Abstract Text Summarization System for German Text Summarization Task

Petr Motlicek

Text summarization is considered as a challenging task in the NLP community. The availability of datasets for the task of multilingual text summarization is rare, and such datasets are difficult to construct. In this work, we build an abstract text summari ...

Idiap2020

Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task

Karl Aberer, Rémi Philippe Lebret, Alireza Mohammadshahi

In this paper, we propose a new approach to learn multimodal multilingual embeddings for matching images and their relevant captions in two languages. We combine two existing objective functions to make images and captions close in a joint embedding space ...

Association for Computational Linguistics2019

Crosslingual Document Embedding as Reduced-Rank Ridge Regression

Martin Jaggi, Robert West, Martin Josifoski, Ivan Paskov

There has recently been much interest in extending vector-based word representations to multiple languages, such that words can be compared across languages. In this paper, we shift the focus from words to documents and introduce a method for embedding doc ...

2019

Transformer-Based Multi-lingual Sentence Embeddings

In this thesis, we present a transformers-based multi-lingual embedding model to represent sentences in different languages in a common space. To do so, our system uses the structure of a simplified transformer with a shared byte-pair encoding vocabulary f ...

2019

Named Entity Processing for Historical Texts

Maud Ehrmann, Matteo Romanello

Recognition and identification of real-world entities is at the core of virtually any text mining application. As a matter of fact, referential units such as names of persons, locations and organizations underlie the semantics of texts and guide their inte ...

2019

Unsupervised Learning of Representations for Lexical Entailment Detection

Andreas Hug

Detecting lexical entailment plays a fundamental role in a variety of natural language processing tasks and is key to language understanding. Unsupervised methods still play an important role due to the lack of coverage of lexical databases in some domains ...

2018

Word Sense Consistency in Statistical and Neural Machine Translation

Xiao Pu

Different senses of source words must often be rendered by different words in the target language when performing machine translation (MT). Selecting the correct translation of polysemous words can be done based on the contexts of use. However, state-of-th ...

EPFL2018

Text Similarity in Vector Space Models: A Comparative Study

Kenneth Younge, Omid Shahmirzadi, Adam Lugowski

Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to ...

2018

Learning Word Vectors for 157 Languages

Prakhar Gupta, Edouard Grave

Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. A key ingredient to the successful application of these representations is to train them on ...

2018

Building Word Embeddings for Solving Natural Language Processing

Rémi Philippe Lebret

Word embedding is a feature learning technique which aims at mapping words from a vocabulary into vectors of real numbers in a low-dimensional space. By leveraging large corpora of unlabeled text, such continuous space representations can be computed for c ...

École Polytechnique Fédérale de Lausanne2016