Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
In this paper we present Aligned Scores and Performances (ASAP): a new dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.The scores are provided as paired MusicXML files and quantized ...
In this paper, we introduce our recent studies on human perception in audio event classification. In particular, the pre-trained model VGGish is used as feature extractor to process audio data, and DenseNet is trained by and used as feature extractor for o ...
As high readout channel density and compact design become the norm for HEP detectors so is the detector cooling at temperatures below the experimental site dewpoint. This increases the importance of humidity and temperature monitoring systems that are also ...
We investigate the design of a convolutional layer where kernels are parameterized functions. This layer aims at being the input layer of convolutional neural networks for audio applications or applications involving time-series. The kernels are defined as ...
A system for subpixel resolution imaging of an amplitude and quantitative phase image, the system including a waveguide having a top plane, a bottom plane, and two sides, an array of light sources emitting first befit beams from one side of the two sides o ...
ECMA-407, the first 3D audio standard worldwide, introduces a new concept of static models to lower bitrate coding, which may be equally applied with channels, channels and objects and Higher Order Ambisonics (HOA). Static models may either operate in time ...
This report presents a semi-supervised method to jointly extract audio-visual sources from a scene. It consist of applying a supervised method to segment the video signal followed by an automatic process to properly separate the audio track. This approach ...
This paper addresses the multimodal nature of social dominance and presents multimodal fusion techniques to combine audio and visual nonverbal cues for dominance estimation in small group conversations. We combine the two modalities both at the feature ext ...
This paper addresses the multimodal nature of social dominance and presents multimodal fusion techniques to combine audio and visual nonverbal cues for dominance estimation in small group conversations. We combine the two modalities both at the feature ext ...
This paper introduces high-quality audio coding using psychoacoustic models. This technology is now abundant, with gadgets named after a standard (mp3 players) and the ability to play high-quality audio from literally billions of devices. The usual paradig ...