Application of Information Retrieval Techniques to Single Writer Documents

Alessandro Vinciarelli

This work shows Information Retrieval experiments performed over handwritten documents produced by a single writer. The same retrieval task has been performed over both manual (no errors) and automatic (Word Error Rate around 45%) transcriptions of 200 han ...

2005

Noisy Text Categorization

Alessandro Vinciarelli

This work presents categorization experiments performed over noisy texts. By noisy it is meant any text obtained through an extraction process (affected by errors) from media other than digital texts (e.g. transcriptions of speech recordings extracted with ...

2005

Inferring Document Similarity from Hyperlinks

Samy Bengio, David Grangier

Assessing semantic similarity between text documents is a crucial aspect in Information Retrieval systems. In this work, we propose to use hyperlink information to derive a similarity measure that can then be applied to compare any text documents, with or ...

2005

Inferring Document Similarity from Hyper-links

Samy Bengio, David Grangier

Assessing semantic similarity between text documents is a crucial aspect in Information Retrieval systems. In this paper, we propose a technique to derive a similarity measure from hyper-link information. As linked documents are generally semantically clos ...

IDIAP, 2005

Locally Testable Codes

Mahdi Cheraghchi Bashi Astaneh

Error-correcting codes are combinatorial objects that allow reliable recovery of information in the presence of errors by cleverly augmenting the original information with a certain amount of redundancy. The availability of efficient means of error detection i ...

2005

Exploiting Hyperlinks to Learn a Retrieval Model

Samy Bengio, David Grangier

Information Retrieval (IR) aims at solving a ranking problem: given a query q and a corpus C, the documents of C should be ranked such that the documents relevant to q appear above the others. This task is generally performed by ranking the documen ...

2005

Improving Speech Recognition Using a Data-Driven Approach

Hervé Bourlard, Guillermo Aradilla

In this paper, we investigate the possibility of enhancing state-of-the-art HMM-based speech recognition systems using data-driven techniques, where the whole set of training utterances is used as reference models and recognition is then performed through the ...

IDIAP, 2005

Improving Speech Recognition Using a Data-Driven Approach

Hervé Bourlard, Guillermo Aradilla

2005

On the Use of Information Retrieval Measures for Speech Recognition Evaluation

Hervé Bourlard, Daniel Gatica-Perez, John David Scott Dines, Darren Moore

This paper discusses the evaluation of automatic speech recognition (ASR) systems developed for practical applications, suggesting a set of criteria for application-oriented performance measures. The commonly used word error rate (WER), which poses ASR eva ...

IDIAP, 2004

Effect of Recognition Errors on Information Retrieval Performance

Alessandro Vinciarelli

This work shows experiments on the retrieval of handwritten documents. The performance of the same state-of-the-art Information Retrieval system is compared when dealing with manual (no errors) and automatic (Word Error Rate around 50%) transcriptions of t ...

IDIAP, 2004

Exploiting Hyperlinks to Learn a Retrieval Model

Samy Bengio, David Grangier

Information Retrieval (IR) aims at solving a ranking problem: given a query q and a corpus C, the documents of C should be ranked such that the documents relevant to q appear above the others. This task is generally performed by ranking the documen ...

2005