Publication

Transformer-Based Multi-lingual Sentence Embeddings

Related publications (33)

Better Word Embeddings by Disentangling Contextual n-Gram Information

Martin Jaggi, Matteo Pagliardini, Prakhar Gupta

Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings results in improved unigram embeddings. We claim that training word em ...

2019

Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task

Karl Aberer, Rémi Philippe Lebret, Alireza Mohammadshahi

In this paper, we propose a new approach to learn multimodal multilingual embeddings for matching images and their relevant captions in two languages. We combine two existing objective functions to make images and captions close in a joint embedding space ...

Association for Computational Linguistics, 2019

Learning Word Vectors for 157 Languages

Prakhar Gupta, Edouard Grave

Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. A key ingredient to the successful application of these representations is to train them on ...

2018

Word Sense Consistency in Statistical and Neural Machine Translation

Xiao Pu

Different senses of source words must often be rendered by different words in the target language when performing machine translation (MT). Selecting the correct translation of polysemous words can be done based on the contexts of use. However, state-of-th ...

EPFL, 2018

Detecting Trends in Job Advertisements

Pierre Dillenbourg, Kshitij Sharma, Khalil Mrini

We present an automatic method for trend detection in job ads. From a job-posting website, we collect job ads from 16 countries and in 8 languages and 6 job domains. We pre-process them by removing stop words, lemmatising and performing cross-domain filter ...

2017

Building Word Embeddings for Solving Natural Language Processing

Rémi Philippe Lebret

Word embedding is a feature learning technique which aims at mapping words from a vocabulary into vectors of real numbers in a low-dimensional space. By leveraging large corpora of unlabeled text, such continuous space representations can be computed for c ...

École Polytechnique Fédérale de Lausanne, 2016

Word Embeddings for Natural Language Processing

Rémi Philippe Lebret

EPFL, 2016

Word Sequence Modeling using Deep Learning

Joël Yvon Roland Legrand

For a long time, natural language processing (NLP) has relied on generative models with task specific and manually engineered features. Recently, there has been a resurgence of interest for neural networks in the machine learning community, obtaining state ...

EPFL, 2016

"The Sum of Its Parts": Joint Learning of Word and Phrase Representations with Autoencoders

Rémi Philippe Lebret, Ronan Collobert

Recently, there has been a lot of effort to represent words in continuous vector spaces. Those representations have been shown to capture both semantic and syntactic information about words. However, distributed representations of phrases remain a challeng ...

Idiap, 2015

Word Embeddings through Hellinger PCA

Rémi Philippe Lebret, Ronan Collobert

Word embeddings resulting from neural language models have been shown to be a great asset for a large variety of NLP tasks. However, such architecture might be difficult and time-consuming to train. Instead, we propose to drastically simplify the word embe ...

2014