This article describes a new expert-labelled dataset featuring harmonic, phrase, and cadence analyses of all piano sonatas by W.A. Mozart. The dataset draws on the DCML standard for harmonic annotation and is published in accordance with the FAIR principles of Open Science. The annotations have been verified using a data triangulation procedure, which is presented as an alternative approach to handling annotator subjectivity. This procedure is suited to ensuring consistency, within the dataset and beyond, despite the high level of analytical detail afforded by the harmonic annotation syntax employed. The harmony labels also encode contextual information and are therefore suited to investigating music-theoretical questions related to tonal harmony and the harmonic makeup of cadences in the classical style. Apart from providing basic statistical analyses characterizing the dataset, the article illustrates its music-theoretical potential with two preliminary experiments, one on the terminal harmonies of cadences and the other on the relation between performance durations and harmonic density. Furthermore, particular features can be selected to produce more coarse-grained training data, for example for chord detection algorithms that require less analytical detail. To facilitate reuse, the dataset comes with a Python script that allows researchers to easily access various representations of the data tailored to their particular needs.
Martin Alois Rohrmeier, Johannes Hentschel, Gabriele Cecchetti, Sabrina Laneve, Ludovica Schaerf
Martin Alois Rohrmeier, Fabian Claude Moss, Robert Lieck