Cross-validation is commonly used to select the recommendation algorithm that will generalize best to as-yet-unseen data. In many situations, however, the dataset available for cross-validation is scarce, and the selected algorithm might not be the best suited to the unseen data. In contrast, established companies have large amounts of data with which to select and tune their recommender algorithms, which should therefore generalize better. These companies often make their recommender systems available as black boxes, i.e., users query the recommender through an API or a browser. This paper proposes RecRank, a technique that exploits a black-box recommender system in addition to classic cross-validation. RecRank employs graph similarity measures to compute a distance between the output recommendations of the black box and those of the candidate algorithms. We empirically show that, on the MovieLens dataset, RecRank improves algorithm selection substantially (33%) compared with standalone cross-validation.
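The idea of ranking candidate algorithms by their distance to a black-box recommender can be illustrated with a minimal sketch. It is not the RecRank implementation: the abstract does not say which graph similarity measures are used, so the sketch assumes a simple Jaccard distance over recommendation edges (query item to recommended item). The names `edge_set`, `jaccard_distance`, `rank_candidates`, and the toy data are all hypothetical. How the resulting ranking is combined with cross-validation is not shown here.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: each recommender's output maps a query item
# to the list of items it recommends for that query.
Recs = Dict[str, List[str]]


def edge_set(recs: Recs) -> Set[Tuple[str, str]]:
    """View the recommendations as a directed graph: one edge per (query, recommended) pair."""
    return {(query, item) for query, items in recs.items() for item in items}


def jaccard_distance(a: Recs, b: Recs) -> float:
    """Jaccard distance between the edge sets of two recommendation graphs.

    This measure is an assumption made for illustration; it stands in for
    whichever graph similarity measures the paper actually uses.
    """
    edges_a, edges_b = edge_set(a), edge_set(b)
    union = edges_a | edges_b
    if not union:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - len(edges_a & edges_b) / len(union)


def rank_candidates(black_box: Recs, candidates: Dict[str, Recs]) -> List[Tuple[str, float]]:
    """Sort candidate algorithms by increasing distance to the black-box output."""
    scored = [(name, jaccard_distance(black_box, recs)) for name, recs in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data (hypothetical): outputs collected by querying the black box
# and two candidate algorithms with the same items.
black_box = {"A": ["B", "C"], "B": ["A", "C"]}
candidates = {
    "popularity": {"A": ["D", "E"], "B": ["D", "E"]},
    "knn": {"A": ["B", "C"], "B": ["A", "D"]},
}

ranking = rank_candidates(black_box, candidates)
```

In this toy example the `knn` candidate shares three of its four edges with the black box and is ranked first, while `popularity` shares none and is ranked last.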