Publication

Distinguishing the Popularity Between Topics: A System for Up-to-date Opinion Retrieval and Mining in the Web

Related publications (35)

Towards better entity resolution techniques for Web document collections

Karl Aberer, Zoltán Miklós, Surender Reddy Yerva

As person names are non-unique, the same name on different Web pages might or might not refer to the same real-world person. This entity identification problem is one of the most challenging issues in realizing the Semantic Web or entity-oriented search. W ...

1st International Workshop on Data Engineering meets the Semantic Web (DESWeb'2010) (co-located with ICDE'2010), 2010

Towards better entity resolution techniques for Web document collections

Karl Aberer, Zoltán Miklós, Surender Reddy Yerva

IEEE, 2010

Understanding the Web

Eda Baykan

The World Wide Web is one of the most widely used information resources. Understanding the web better will enable us to benefit more from it. In this thesis we develop techniques to learn properties of web pages, such as language and topic, using only the ...

EPFL, 2009

Purely URL-based Topic Classification

Monika Henzinger, Ingmar Weber, Eda Baykan, Ludmila Marian

Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content, but a URL-only classifier is preferable (i) when speed is crucial, (ii) to enable conte ...

2009

A Comparison of Techniques for Sampling Web Pages

Monika Henzinger, Eda Baykan

As the World Wide Web is growing rapidly, it is becoming increasingly challenging to gather representative information about it. Instead of crawling the web exhaustively, one has to resort to other techniques, such as sampling, to determine the properties of the ...

2009

Given only the URL of a web page, can we identify its language? This is the question that we examine in this paper. Such a language classifier is, for example, useful for crawlers of web search engines, which frequently try to satisfy certain language quot ...

2008

This thesis describes our research results in the context of peer-to-peer information retrieval (P2P-IR). One goal in P2P-IR is to build a search engine for the World Wide Web (WWW) that runs on up to hundreds of thousands or even millions of computers distri ...

EPFL, 2008

In our everyday life we often see objects or persons and are aware that there are related digital services, such as an online ticket service when seeing a poster advertising a concert. Currently it is a rather time-consuming activity to find the related inf ...

ACM Press, 2008

Our daily life is pervaded by digital information and devices, not least the common mobile phone. However, a seamless connection between our physical world, such as a movie trailer on a screen in the main rail station, and its digital counterparts, such as ...

2008

We consider the applicability of terms extracted from anchortext as a source of Web page descriptions in the form of tags. With a relatively simple and easy-to-use method, we show that anchortext significantly overlaps with tags obtained from the popular t ...

2008