Text Similarity in Vector Space Models: A Comparative Study
Related publications (52)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is a ...
In this thesis, we explore the use of machine learning techniques for information retrieval. More specifically, we focus on ad-hoc retrieval, which is concerned with searching large corpora to identify the documents relevant to user queries. This identific ...
In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We compare results obtained using a general measure of lexical similarity based on c2 and by counting the number of discourse conne ...
The Automatic Content Linking Device is a just-in-time document retrieval system which monitors an ongoing conversation or a monologue and enriches it with potentially related documents, including multimedia ones, from local repositories or from the Intern ...
ABSTRACT. An information-geometric approach for document similarities in the framework of “Probabilistic Latent Semantic Indexing” was first proposed by T. Hofmann (2000) and later extended (“revisited”) by Nyffenegger et al. (2006). This paper presents an ...
The Probabilistic Latent Semantic indexing model, introduced by T. Hofmann (1999), has engendered applications ill numerous fields, notably document classification and information retrieval. In this context, the Fisher kernel was found to be an appropriate ...
Springer-Verlag New York, Ms Ingrid Cunningham, 175 Fifth Ave, New York, Ny 10010 Usa2009
In this thesis, we explore the use of machine learning techniques for information retrieval. More specifically, we focus on ad-hoc retrieval, which is concerned with searching large corpora to identify the documents relevant to user queries. Thisidentifica ...
We first present our work in machine translation, during which we used aligned sentences to train a neural network to embed n-grams of different languages into an d-dimensional space, such that n-grams that are the translation of each other are close with ...
Tasks that rely on semantic content of documents, notably Information Retrieval and Document Classification, can benefit from a good account of document context, i.e. the semantic association between documents. To this effect, the scheme of latent semantic ...
In this thesis, we explore the use of machine learning techniques for information retrieval. More specifically, we focus on ad-hoc retrieval, which is concerned with searching large corpora to identify the documents relevant to user queries. This identific ...