Information Retrieval (IR) aims at solving a ranking problem: given a query q and a corpus C, the documents of C should be ranked such that the documents relevant to q appear above the others. This task is generally performed by ranking the documents d in C according to their similarity with respect to q, sim(q, d). An effective function sim could be identified using a large set of queries with their corresponding relevance assessments. However, such data are especially expensive to label. As an alternative, we propose to rely on hyperlink data, which convey analogous semantic relationships. We then show empirically that a similarity measure inferred from hyperlinked documents can outperform the state-of-the-art Okapi approach when applied to a non-hyperlinked retrieval corpus.
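To make the ranking formulation concrete, here is a minimal sketch of ordering documents by their similarity to a query. It assumes whitespace tokenization and cosine similarity over raw term counts, which is a simple stand-in chosen for illustration; it is neither the measure inferred from hyperlinks nor Okapi. The names `bag_of_words`, `cosine_sim`, and `rank`, and the toy corpus, are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map lowercased whitespace tokens to their counts (assumed tokenizer)."""
    return Counter(text.lower().split())


def cosine_sim(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, corpus, sim=cosine_sim):
    """Return corpus indices sorted by decreasing similarity to the query."""
    q = bag_of_words(query)
    scores = [sim(q, bag_of_words(d)) for d in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Toy corpus, invented for illustration only.
corpus = [
    "hyperlinks connect related web pages",
    "cooking pasta with tomato sauce",
    "ranking documents for a query in information retrieval",
]
ranking = rank("information retrieval ranking", corpus)
```

Because `rank` takes the similarity function as a parameter, a learned measure could be swapped in for `cosine_sim` without changing the ranking step.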
Frédéric Kaplan, Vincent Christian Buntinx, Cyril Antoine Michel Bornet
François Fleuret, Nicolae Suditu