Word Sense Consistency in Statistical and Neural Machine Translation
Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to ...
There has recently been much interest in extending vector-based word representations to multiple languages, such that words can be compared across languages. In this paper, we shift the focus from words to documents and introduce a method for embedding doc ...
Linking facts across documents is a challenging task, as the language used to express the same information in a sentence can vary significantly, which complicates the task of multi-document summarization. Consequently, existing approaches heavily rely on h ...
The design process is a series of endeavors aimed to solve a problem, which is extensively defined as a need, a task, a situation, etc. Visualizations are objects that result from a design process that solves a problem of unreadable data. They are technica ...
In this thesis, we present a transformer-based multilingual embedding model to represent sentences in different languages in a common space. To do so, our system uses the structure of a simplified transformer with a shared byte-pair encoding vocabulary f ...
Speech-to-speech translation is a framework which recognises speech in an input language, translates it to a target language and synthesises speech in this target language. In such a system, variations in the speech signal which are inherent to natural hum ...
Character-level Neural Machine Translation (NMT) models have recently achieved impressive results on many language pairs. They mainly do well for Indo-European language pairs, where the languages share the same writing system. However, for translating betwe ...
This article investigates the TED digital infrastructure for translating Science and Technology projects. Specifically, we seek to investigate the TED infrastructure's generative mechanisms and their relation to valuation mechanisms and practices. The resea ...
Moroccan Darija is a variant of Arabic with many influences. Using the Open Multilingual WordNet (OMW), we compare the lemmas in the Moroccan Darija Wordnet (MDW) with the standard Arabic, French and Spanish ones. We then compare the lemmas in each synset ...
How can we effectively develop speech technology for languages where no transcribed data is available? Many existing approaches use no annotated resources at all, yet it makes sense to leverage information from large annotated corpora in other languages, f ...