Kamusi Pre:D – Lexicon-based source-side predisambiguation for MT and other text processing applications
We tackle the problem of disambiguating entities on the Web. We propose a user-driven scheme where graphs of entities -- represented by globally identifiable declarative artifacts -- self-organize in a dynamic and probabilistic manner. Our solution has the ...
The work presented in this thesis deals with several problems met in information retrieval (IR), a task which one can summarise as identifying, in a collection of "documents", a subset of documents carrying the sought information, i.e., relevant for a request ...
Cursive character recognition is a challenging task due to high variability and intrinsic ambiguity of cursive letters. This paper presents C-Cube (Cursive Character Challenge), a new public-domain cursive character database. C-Cube contains ...
This article compares one-dimensional and multi-dimensional dialogue act tagsets used for automatic labeling of utterances. The influence of tagset dimensionality on tagging accuracy is first discussed theoretically, then based on empirical data from human ...
Many discourse connectives can signal several types of relations between sentences. Their automatic disambiguation, i.e. the labeling of the correct sense of each occurrence, is important for discourse parsing, but could also be helpful to machine translati ...
In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We compare results obtained using a general measure of lexical similarity based on χ² and by counting the number of discourse conne ...
This document presents an overview of the mobile biometry (MOBIO) database. This document is written expressly for the face and speech organised for the 2010 International Conference on Pattern Recognition. ...
Loosely structured heterogeneous information spaces are typically created by merging data from a variety of different applications and information sources. A common problem these information spaces need to address is that various data describe the same rea ...
This paper investigates an isolated setting of the lexical substitution task of replacing words with their synonyms. In particular, we examine this problem in the setting of subtitle generation and evaluate state-of-the-art scoring methods that predict the ...
We first present our work in machine translation, during which we used aligned sentences to train a neural network to embed n-grams of different languages into a d-dimensional space, such that n-grams that are the translation of each other are close with ...