Document ranking for scientific publications involves a variety of specialized resources (e.g. author or citation indexes) that are difficult to use within standard general-purpose search engines, which usually operate on large-scale heterogeneous do ...
Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Web search engines such as Google and Yahoo! already rely upon complex systems to be able to return relevant query re ...
We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with bandwidth consumption, which has been identified as the major problem for the standard P2P approach with single term i ...
Clustering similar documents is a difficult task for text data mining. Difficulties stem especially from the way documents are translated into numerical vectors. In this chapter, we will present a method that uses a Self-Organizing Map (SOM) to cluster medic ...
This thesis describes our research results in the context of peer-to-peer information retrieval (P2P-IR). One goal in P2P-IR is to build a search engine for the World Wide Web (WWW) that runs on up to hundreds of thousands or even millions of computers distri ...
The PLSI model (“Probabilistic Latent Semantic Indexing”) offers a document indexing scheme based on probabilistic latent category models. It has found applications in diverse fields, notably in information retrieval (IR). Nevertheless, PLSI cannot process d ...
Results caching is an efficient technique for reducing the query processing load; hence it is commonly used in real search engines. This technique, however, bounds the maximum hit rate due to the large fraction of singleton queries, which is an important l ...
Ranking in scientific publication databases involves a variety of additional resources that are usually not applied in standard general-purpose search engines. Moreover, community-specific expectations of users influence the perception of the adequacy of r ...
Recommender systems have emerged as an effective decision tool to help users more easily and quickly find products that they prefer, especially in e-commerce environments. However, few studies have tried to understand how this technology has influenced the ...
In this paper we present the AlvisP2P IR engine, which enables efficient retrieval with multi-keyword queries from a global document collection available in a P2P network. In such a network, each peer publishes its local index and invests a part of its loc ...