Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?
Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur Graph Search.
Supervised and unsupervised kernel-based algorithms widely used in the physical sciences depend upon the notion of similarity. Their reliance on pre-defined distance metrics-e.g. the Euclidean or Manhattan distance-are problematic especially when used in combination with high-dimensional feature vectors for which the similarity measure does not well-reflect the differences in the target property. Metric learning is an elegant approach to surmount this shortcoming and find a property-informed transformation of the feature space. We propose a new algorithm for metric learning specifically adapted for kernel ridge regression (KRR): metric learning for kernel ridge regression (MLKRR). It is based on the Metric Learning for Kernel Regression framework using the Nadaraya-Watson estimator, which we show to be inferior to the KRR estimator for typical physics-based machine learning tasks. The MLKRR algorithm allows for superior predictive performance on the benchmark regression task of atomisation energies of QM9 molecules, as well as generating more meaningful low-dimensional projections of the modified feature space.
Florent Gérard Krzakala, Lenka Zdeborová, Hugo Chao Cui
Michele Ceriotti, Alberto Fabrizio, Benjamin André René Meyer, Edgar Albert Engel, Raimon Fabregat I De Aguilar-Amat, Veronika Juraskova