Publication

Thematic Indexing of Spoken Documents by Using Self-Organizing Maps

Related publications (59)

Examining European Press Coverage of the Covid-19 No-Vax Movement: An NLP Framework

Daniel Gatica-Perez

This paper examines how the European press dealt with the no-vax reactions against the Covid-19 vaccine and the dis- and misinformation associated with this movement. Using a curated dataset of 1786 articles from 19 European newspapers on the anti-vaccine ...

ASSOC COMPUTING MACHINERY, 2023

Robustness, replicability and scalability in topic modelling

Orion B Penner

Approaches for estimating the similarity between individual publications are an area of long-standing interest in the scientometrics and informetrics communities. Traditional techniques have generally relied on references and other metadata, while text mi ...

ELSEVIER, 2022

Quote Erat Demonstrandum: A Web Interface for Exploring the Quotebank Corpus

Robert West, Akhil Arora, Andreas Oliver Spitz, Huan-Cheng Chang, Vuk Vukovic

The use of attributed quotes is the most direct and least filtered pathway of information propagation in news. Consequently, quotes play a central role in the conception, reception, and analysis of news stories. Since quotes provide a more direct window in ...

ASSOC COMPUTING MACHINERY, 2022

Further results on latent discourse models and word embeddings

Youssef Allouah

We discuss some properties of generative models for word embeddings. Namely, (Arora et al., 2016) proposed a latent discourse model implying the concentration of the partition function of the word vectors. This concentration phenomenon led to an asymptotic ...

MICROTOME PUBL, 2021

Self-Supervised Neural Topic Modeling

Martin Jaggi

Topic models are useful tools for analyzing and interpreting the main underlying themes of large corpora of text. Most topic models rely on word co-occurrence for computing a topic, i.e., a weighted set of words that together represent a high-level semanti ...

Assoc Computational Linguistics-Acl, 2021

Crosslingual Document Embedding as Reduced-Rank Ridge Regression

Martin Jaggi, Robert West, Martin Josifoski, Ivan Paskov

There has recently been much interest in extending vector-based word representations to multiple languages, such that words can be compared across languages. In this paper, we shift the focus from words to documents and introduce a method for embedding doc ...

2019

Beyond Keyword Search: Semantic Indexing and Exploration of Large Collections of Historical Newspapers

Maud Ehrmann

For long held on library and archive shelving, historical newspapers are currently undergoing mass digitization and millions of facsimiles, along with their machine-readable content acquired via Optical Character Recognition, are becoming accessible via a ...

2019

Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task

Karl Aberer, Rémi Philippe Lebret, Alireza Mohammadshahi

In this paper, we propose a new approach to learn multimodal multilingual embeddings for matching images and their relevant captions in two languages. We combine two existing objective functions to make images and captions close in a joint embedding space ...

Association for Computational Linguistics, 2019

Evolution of Topics and Novelty in Science

Orion B Penner

Methods of estimating the similarity between individual publications are an area of long-standing interest in the scientometrics community. Traditional methods have generally relied on references and other metadata, while text mining approaches based on tit ...

INT SOC SCIENTOMETRICS & INFORMETRICS-ISSI, 2019

New Multi-Keyword Ciphertext Search Method for Sensor Network Cloud Platforms

Jiyong Zhang, Yue Wang, Hongyu Yang

This paper proposed a multi-keyword ciphertext search, based on an improved-quality hierarchical clustering (MCS-IQHC) method. MCS-IQHC is a novel technique, which is tailored to work with encrypted data. It has improved search accuracy and can self-adapt ...

MDPI, 2018