Publication

PLSI: The True Fisher Kernel and beyond IID Processes, Information Matrix and Model Identification in PLSI

Related publications (40)

Robustness, replicability and scalability in topic modelling

Orion B Penner

Approaches for estimating the similarity between individual publications are an area of long-standing interest in the scientometrics and informetrics communities. Traditional techniques have generally relied on references and other metadata, while text mi ...

ELSEVIER, 2022

Evolution of Topics and Novelty in Science

Orion B Penner

Methods of estimating the similarity between individual publications are an area of long-standing interest in the scientometrics community. Traditional methods have generally relied on references and other metadata, while text mining approaches based on tit ...

INT SOC SCIENTOMETRICS & INFORMETRICS-ISSI, 2019

Text Similarity in Vector Space Models: A Comparative Study

Kenneth Younge, Omid Shahmirzadi, Adam Lugowski

Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to ...

2018

Building Word Embeddings for Solving Natural Language Processing

Rémi Philippe Lebret

Word embedding is a feature learning technique which aims at mapping words from a vocabulary into vectors of real numbers in a low-dimensional space. By leveraging large corpora of unlabeled text, such continuous space representations can be computed for c ...

École Polytechnique Fédérale de Lausanne, 2016

Word Embeddings for Natural Language Processing

Rémi Philippe Lebret

EPFL, 2016

Adaptive relevance feedback for large-scale image retrieval

François Fleuret, Nicolae Suditu

Content-based image retrieval aims at substituting traditional indexing based on manual annotation by using automatically-extracted visual indexing features. Novel techniques are needed however to efficiently deal with the semantic gap (i.e. the partial ma ...

2016

N-gram-Based Low-Dimensional Representation for Document Classification

Rémi Philippe Lebret, Ronan Collobert

The bag-of-words (BOW) model is the common approach for classifying documents, where words are used as features for training a classifier. This generally involves a huge number of features. Some techniques, such as Latent Semantic Analysis (LSA) or Latent D ...

2015

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Andrei Popescu-Belis, Majid Yazdani

We propose a method for computing semantic relatedness between words or texts by using knowledge from hypertext encyclopedias such as Wikipedia. A network of concepts is built by filtering the encyclopedia's articles, each concept corresponding to an artic ...

Elsevier Science Bv, 2013

Query Optimization in Context of Pseudo Relevant Documents

Ashish Kishore Bindal

In the conventional vector space model for information retrieval, query vector generation is imperfect for retrieving the precise documents desired by the user. In this paper, we present a stochastic approach for optimizing the query vector without use ...

2012

Tag Recommendation for Large-Scale Ontology-Based Information Systems

Karl Aberer, Alexey Boyarsky, Oleg Ruchayskiy, Philippe Cudré-Mauroux, Roman Prokofyev

We tackle the problem of improving the relevance of automatically selected tags in large-scale ontology-based information systems. Contrary to traditional settings where tags can be chosen arbitrarily, we focus on the problem of recommending tags (e.g., co ...

Springer, 2012