Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
As the number of scientific papers getting published is likely to soar, most of modern paper management systems (e.g. ScienceWise, Mendeley, CiteULike) support tag-based retrieval. In that, each paper is associated with a set of \emph{tags}, allowing user to search for relevant papers by formulating tag-based queries against the system. One of the most critical issues in tag-based retrieval is that user often has difficulties in precisely formulating his information need. Addressing this issue, our paper tackles the problem of automatically suggesting new tags for user when he formulates a query. The set of tags are selected in such a way that resolves query ambiguity in two aspects: \emph{informativeness} and \emph{diversity}. While the former reduces user effort in finding the desired papers, the latter enhances the variety of information shown to user. Through studying theoretical properties of this problem, we propose a heuristic-based algorithm with several salient performance guarantees. We also demonstrate the efficiency of our approach through extensive experimentation using real-world datasets.
Cédric Duchene, Nicolas Henchoz, Emily Clare Groves, Romain Simon Collaud, Andreas Sonderegger, Yoann Pierre Douillet
Michael Christoph Gastpar, Guillaume Jean Op 't Veld
Robert West, Tiziano Piccardi, Giovanni Colavizza