We consider how to assess, from a set of test error results, whether two classifiers are performing equally well. This question is often addressed within sampling theory, using classical hypothesis testing. Here we present a simple Bayesian treatment that is quite general and that can also handle the (practically common) case in which the errors made by the two classifiers are dependent.
Francesco Varrato, Antoine Louis Claude Masson, Eliane Ninfa Blumer, Sitthida Samath, Fantin Reichler, Jérôme Julien Chaptinel