Cough audio signal classification has been successfully used to diagnose a variety of respiratory conditions, and there has been significant interest in leveraging Machine Learning (ML) to provide widespread COVID-19 screening. The COUGHVID dataset provides over 25,000 crowdsourced cough recordings representing a wide range of participant ages, genders, geographic locations, and COVID-19 statuses. First, we contribute our open-sourced cough detection algorithm to the research community to assist in data robustness assessment. Second, four experienced physicians labeled more than 2,800 recordings to diagnose medical abnormalities present in the coughs, thereby contributing one of the largest expert-labeled cough datasets in existence that can be used for a plethora of cough audio classification tasks. Finally, we ensured that coughs labeled as symptomatic and COVID-19 originate from countries with high infection rates. As a result, the COUGHVID dataset contributes a wealth of cough recordings for training ML models to address the world’s most urgent health crises.
David Atienza Alonso, Tomas Teijeiro Campo, Lara Orlandic
Athanasios Nenes, Tamar Kohn, Kalliopi Violaki, Ghislain Gilles Jean-Michel Motos, Aline Laetitia Schaub, Shannon Christa David, Walter Hugentobler, Htet Kyi Wynn, Céline Terrettaz, Laura José Costa Henriques, Daniel Scott Nolan, Marta Augugliaro
Martin Jaggi, Mary-Anne Hartley, Juliane Dervaux, Tatjana Chavdarova, Daniel Mueller, Julien Niklas Heitmann, Daniel Hinjos García, Alexandre Perez