Multilingual bottleneck features for subword modeling in zero-resource languages
Related publications (33)
This paper discusses Kamusi Pre:D, a system to improve translation by disambiguating word senses in a source document with reference to a large concept-based lexicon that is aligned by sense across numerous languages. Currently under active development, th ...
Medulloblastoma (MB) is a type of brain cancer that represents roughly 25% of all brain tumors in children. In the anaplastic medulloblastoma subtype, it is important to identify the degree of irregularity and lack of organization of cells as this correla ...
Springer, 2015
Medulloblastoma (MB) is a type of brain cancer that represents roughly 25% of all brain tumors in children. In the anaplastic medulloblastoma subtype, it is important to identify the degree of irregularity and lack of organization of cells as this correlat ...
This paper introduces a model of multiple-instance learning applied to the prediction of aspect ratings or judgments of specific properties of an item from user-contributed texts such as product reviews. Each variable-length text is represented by several ...
We study the task of learning to rank images given a text query, a problem that is complicated by the issue of multiple senses. That is, the senses of interest are typically the visually distinct concepts that a user wishes to retrieve. In this paper, we p ...
Europarl is a large multilingual corpus containing the minutes of the debates at the European Parliament. This article presents a method to extract different corpora from Europarl: monolingual and multilingual comparable corpora, as well as parallel corpor ...
Word embeddings resulting from neural language models have been shown to be a great asset for a large variety of NLP tasks. However, such architectures can be difficult and time-consuming to train. Instead, we propose to drastically simplify the word embe ...
2014
Recent works on word representations mostly rely on predictive models. Distributed word representations (aka word embeddings) are trained to optimally predict the contexts in which the corresponding words tend to appear. Such models have succeeded in captu ...
Recently, there has been a lot of effort to represent words in continuous vector spaces. Those representations have been shown to capture both semantic and syntactic information about words. However, distributed representations of phrases remain a challeng ...