Related publications (107)

Data-Driven Music Theory: Curating and Investigating Large Corpora of Digitally Encoded Music Analyses

Johannes Hentschel

This dissertation on data-driven music theory is centered around curatorial practices concerning the creation, publication, and evaluation of large, expert-annotated symbolic datasets. With its primary interest in the harmony of European tonal music from i ...
EPFL, 2024

Post-correction of Historical Text Transcripts with Large Language Models: An Exploratory Study

Frédéric Kaplan, Maud Ehrmann, Matteo Romanello, Sven-Nicolas Yoann Najem, Emanuela Boros

The quality of automatic transcription of heritage documents, whether from printed, manuscript or audio sources, has a decisive impact on the ability to search and process historical texts. Although significant progress has been made in text recognition ( ...
The Association for Computational Linguistics, 2024

An Annotated Corpus of Tonal Piano Music from the Long 19th Century

Martin Alois Rohrmeier, Fabian Claude Moss, Markus Franz Josef Neuwirth, Johannes Hentschel

We present a dataset of 264 annotated piano pieces of nine composers, composed in the long 19th century (https://doi.org/10.5281/zenodo.7483349). Annotations adhere to the DCML harmony annotation standard and include Roman numerals, phrase boundaries, and ...
Ohio State Univ, Sch Music, 2023

Ce que les machines ont vu et que nous ne savons pas encore

Frédéric Kaplan, Isabella Di Lenardo

This article conceptualizes the idea that there is a "dark matter" made up of the latent structures identified by the machine gaze on large heritage photographic collections. The photographic campaigns of art history ...
2023

Lausanne Historical Censuses Dataset HTR 35k

Lucas Arnaud André Rappo, Rémi Guillaume Petitpierre, Marion Kramer

This training dataset includes a total of 34,913 manually transcribed text segments. It is dedicated to the handwritten text recognition (HTR) of historical sources, typically tabular records, such as censuses. This dataset is based on a sample of 83 pages ...
Zenodo, 2023

From Archival Sources to Structured Historical Information: Annotating and Exploring the "Accordi dei Garzoni"

Frédéric Kaplan, Maud Ehrmann, Orlin Biserov Topalov

If automatic document processing techniques have achieved a certain maturity for present time documents, the transformation of hand-written documents into well-represented, structured and connected data which can satisfactorily be used for historical study ...
Routledge, Taylor & Francis Group, 2023

Bias at a Second Glance: A Deep Dive into Bias for German Educational Peer-Review Data Modeling

Vinitra Swamy, Thiemo Wambsganss

Natural Language Processing (NLP) has become increasingly utilized to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in di ...
2022

SKILL: Structured Knowledge Infusion for Large Language Models

Martin Jaggi, Fedor Moiseev

Large language models (LLMs) have demonstrated human-level performance on a vast spectrum of natural language tasks. However, it is largely unexplored whether they can better internalize knowledge from structured data, such as a knowledge graph, or from ...
ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2022

Frédéric Chopin - Mazurkas (A corpus of annotated scores)

Martin Alois Rohrmeier, Markus Franz Josef Neuwirth, Johannes Hentschel

This corpus of annotated MuseScore files has been created within the DCML corpus initiative and employs the DCML harmony annotation standard. It is one of nine similar corpora that have been grouped together into An Annotated Corpus of Tonal Piano Music ...
Zenodo, 2022

Claude Debussy - Suite Bergamasque (A corpus of annotated scores)

Martin Alois Rohrmeier, Markus Franz Josef Neuwirth, Johannes Hentschel

This corpus of annotated MuseScore files has been created within the DCML corpus initiative and employs the DCML harmony annotation standard. It is one of nine similar corpora that have been grouped together into An Annotated Corpus of Tonal Piano Music ...
Zenodo, 2022
