Publication

Web Text Retrieval with a P2P Query-Driven Index

Related publications (53)

Query-driven indexing in large-scale distributed systems

Gleb Skobeltsyn

Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Web search engines such as Google and Yahoo! already rely upon complex systems to be able to return relevant query re ...

EPFL, 2009

Understanding the Web

Eda Baykan

The World Wide Web is one of the most widely used information resources. Understanding the web better will enable us to benefit more from it. In this thesis we develop techniques to learn the properties of web pages, such as language and topic, using only the ...

EPFL, 2009

Machine learning for information retrieval

David Grangier

In this thesis, we explore the use of machine learning techniques for information retrieval. More specifically, we focus on ad-hoc retrieval, which is concerned with searching large corpora to identify the documents relevant to user queries. This identific ...

EPFL, 2008

Machine Learning for Information Retrieval

David Grangier

École Polytechnique Fédérale de Lausanne, 2008

Machine Learning for Information Retrieval

David Grangier

IDIAP, 2008

Perspectives for Rank Aggregation within Scientific Publication Databases

Martin Rajman, Martin Veselý

Ranking in scientific publication databases involves a variety of additional resources that are usually not applied in standard general purpose search engines. Moreover, community-specific expectations of users influence the perception of the adequacy of r ...

2008

Using Bibliographic Knowledge for Ranking in Scientific Publication Databases

Martin Rajman, Martin Veselý

Document ranking for scientific publications involves a variety of specialized resources (e.g. author or citation indexes) that are difficult to use within standard general-purpose search engines, which usually operate on large-scale heterogeneous do ...

IOS, 2008

AlvisP2P: Scalable Peer-to-Peer Text Retrieval in a Structured P2P Network

Karl Aberer, Martin Rajman, Vinh Toan Luu, Ivana Podnar, Fabius Klemm, Gleb Skobeltsyn

In this paper we present the AlvisP2P IR engine, which enables efficient retrieval with multi-keyword queries from a global document collection available in a P2P network. In such a network, each peer publishes its local index and invests a part of its loc ...

2008

Query-Driven Indexing for Peer-to-Peer Text Retrieval

Karl Aberer, Martin Rajman, Vinh Toan Luu, Ivana Podnar, Gleb Skobeltsyn

We describe a query-driven indexing framework for scalable text retrieval over structured P2P networks. To cope with the bandwidth consumption problem that has been identified as the major obstacle for full-text retrieval in P2P networks, we truncate posti ...

2007

Scalable Peer-to-Peer Web Retrieval with Highly Discriminative Keys

Karl Aberer, Martin Rajman, Vinh Toan Luu, Ivana Podnar, Fabius Klemm

The suitability of peer-to-peer (P2P) approaches for full-text Web retrieval has recently been questioned because of the claimed unacceptable bandwidth consumption induced by retrieval from very large document collections. In this contribution we formalize ...

IEEE, 2007