Publication

Rehabilitation of Count-based Models for Word Vector Representations

Related publications (39)

Evolution of Topics and Novelty in Science

Orion B Penner

Estimating the similarity between individual publications is an area of long-standing interest in the scientometrics community. Traditional methods have generally relied on references and other metadata, while text mining approaches based on tit ...

INT SOC SCIENTOMETRICS & INFORMETRICS-ISSI, 2019

Word Sense Consistency in Statistical and Neural Machine Translation

Xiao Pu

Different senses of source words must often be rendered by different words in the target language when performing machine translation (MT). Selecting the correct translation of polysemous words can be done based on the contexts of use. However, state-of-th ...

EPFL, 2018

Text Similarity in Vector Space Models: A Comparative Study

Kenneth Younge, Omid Shahmirzadi, Adam Lugowski

Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to ...

2018

Unsupervised Learning of Representations for Lexical Entailment Detection

Andreas Hug

Detecting lexical entailment plays a fundamental role in a variety of natural language processing tasks and is key to language understanding. Unsupervised methods still play an important role due to the lack of coverage of lexical databases in some domains ...

2018

Simple Unsupervised Keyphrase Extraction using Sentence Embeddings

Martin Jaggi, Claudiu-Cristian Musat, Kamil Bennani-Smires

Keyphrase extraction is the task of automatically selecting a small set of phrases that best describe a given free text document. Keyphrases can be used for indexing, searching, aggregating and summarizing text documents, serving many automatic as well as ...

2018

Learning Word Vectors for 157 Languages

Prakhar Gupta, Edouard Grave

Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. A key ingredient to the successful application of these representations is to train them on ...

2018

Multilingual bottleneck features for subword modeling in zero-resource languages

Enno Hermann

How can we effectively develop speech technology for languages where no transcribed data is available? Many existing approaches use no annotated resources at all, yet it makes sense to leverage information from large annotated corpora in other languages, f ...

2018

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

Martin Jaggi, Matteo Pagliardini, Prakhar Gupta

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We present a simpl ...

2017

Building Word Embeddings for Solving Natural Language Processing

Rémi Philippe Lebret

Word embedding is a feature learning technique which aims at mapping words from a vocabulary into vectors of real numbers in a low-dimensional space. By leveraging large corpora of unlabeled text, such continuous space representations can be computed for c ...

École Polytechnique Fédérale de Lausanne, 2016

Word Embeddings for Natural Language Processing

Rémi Philippe Lebret

EPFL, 2016