Robustness, replicability and scalability in topic modelling
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
The bag-of-words (BOW) model is the common approach for classifying documents, where words are used as feature for training a classifier. This generally involves a huge number of features. Some techniques, such as Latent Semantic Analysis (LSA) or Latent D ...
This report presents a study on assisting users in building queries to perform real-time searches in a news and social media monitoring system. The system accepts complex queries, and we assist the user by suggesting related keywords or entities. We do thi ...
Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to ...
In this thesis, we propose novel solutions to similarity learning problems on collaborative networks. Similarity learning is essential for modeling and predicting the evolution of collaborative networks. In addition, similarity learning is used to perform ...
Ever since the links between the development of new technologies and economic growth became evident, researchers have attempted to study how the creation of knowledge fosters progress. If pushing the frontier of knowledge has an impact on progress and well ...
Word embedding is a feature learning technique which aims at mapping words from a vocabulary into vectors of real numbers in a low-dimensional space. By leveraging large corpora of unlabeled text, such continuous space representations can be computed for c ...
Word embedding is a feature learning technique which aims at mapping words from a vocabulary into vectors of real numbers in a low-dimensional space. By leveraging large corpora of unlabeled text, such continuous space representations can be computed for c ...
Methods of estimating the similarity between individual publications is an area of long-standing interest in the scientometrics community. Traditional methods have generally relied on references and other metadata, while text mining approaches based on tit ...
While social data is being widely used in various applications such as sentiment analysis and trend prediction, its sheer size also presents great challenges for storing, sharing and processing such data. These challenges can be addressed by data summariza ...
In this thesis, we address the analysis of activities from long term data logs with an emphasis on video recordings. Starting from simple words from video, we progressively build methods to infer higher level scene semantics. The main strategies used to ac ...