Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding, which provides strong guarantees for compression of small datasets - in contrast with more popular estimators whose guarantees hold only in the asymptotic limit. Here we consider a NML-based decision strategy for supervised classification problems, and show that it attains heuristic PAC learning when applied to a wide variety of models. Furthermore, we show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.
Daniele Mari, Lucas Eric Jacques Degeneve
Touradj Ebrahimi, Michela Testolina, Davi Nachtigall Lazzarotto