Publication

Quality-aware similarity assessment for entity matching in Web data

Related publications (72)

From Index Locorum to Citation Network: an Approach to the Automatic Extraction of Canonical References and its Applications to the Study of Classical Texts

My research focusses on the automatic extraction of canonical references from publications in Classics. Such references are the standard way of citing classical texts and are found in great numbers throughout monographs, journal articles and commentaries. ...

King's College London, 2015

Information Extraction on the Web with Credibility Guarantee

Thành Tâm Nguyên

The Web became the central medium for valuable sources of information extraction applications. However, such user-generated resources are often plagued by inaccuracies and misinformation due to the inherent openness and uncertainty of the Web. In this work ...

2015

B-hist: Entity-centric search over personal web browsing history

Karl Aberer, Philippe Cudré-Mauroux, Michele Catasta, Jean-Eudes Marie Ranvier

Web Search is increasingly entity centric; as a large fraction of common queries target specific entities, search results get progressively augmented with semi-structured and multimedia information about those entities. However, search over personal web br ...

Elsevier, 2014

Studying Web Content Credibility by Social Simulation

Karl Aberer

The Internet has become an important source of information that significantly affects social, economical and political life. The content available in the Web is the basis for the operation of the digital economy. Moreover, Web content has become essential ...

University of Surrey, Department of Sociology, 2014

UCNEbase – a database of ultra-conserved non-coding elements and genomic regulatory blocks

Philipp Bucher, Slavica Dimitrieva Janeva

UCNEbase (http://ccg.vital-it.ch/UCNEbase) is a free, web-accessible information resource on the evolution and genomic organization of ultra-conserved non-coding elements (UCNEs). It currently covers 4351 such elements in 18 different species. The majority ...

Oxford University Press, 2013

Distinguishing the Popularity Between Topics: A System for Up-to-date Opinion Retrieval and Mining in the Web

Nikolaos Pappas

The constantly increasing amount of opinionated texts found in the Web had a significant impact in the development of sentiment analysis. So far, the majority of the comparative studies in this field focus on analyzing fixed (offline) collections from cert ...

ACM, 2013

Extracting Informative Textual Parts from Web Pages Containing User-Generated Content

Nikolaos Pappas

The vast amount of user-generated content on the Web has increased the need for handling the problem of automatically processing content in web pages. The segmentation of web pages and noise (non-informative segment) removal are important pre-processing st ...

ACM, 2012

An Agent-Based Focused Crawling Framework for Topic- and Genre-Related Web Document Discovery

Nikolaos Pappas

The discovery of web documents about certain topics is an important task for web-based applications including web document retrieval, opinion mining and knowledge extraction. In this paper, we propose an agent-based focused crawling framework able to retri ...

IEEE, 2012

Some Things You Always Wanted to Know About Web Pages (But Were Too Busy to Ask)

Willy Zwaenepoel, Simon Schubert

The organic growth of the web has led to web sites that exhibit a large variety of properties. We conduct a large-scale study to gain quantitative insights into the browser-side effects of the structure and behavior of thousands of the most popular web si ...

2012

A Decentralized Recommender System for Effective Web Credibility Assessment

Karl Aberer, Alexandra Olteanu, Jean-Eudes Marie Ranvier

An overwhelming and growing amount of data is available online. The problem of untrustworthy online information is augmented by its high economic potential and its dynamic nature, e.g. transient domain names, dynamic content, etc. In this paper, we address ...

2012
