We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build ...
Musical source separation is a complex topic that has been extensively explored in the signal processing community and has benefited greatly from recent machine learning research. Many deep learning models with impressive source separation quality have bee ...
This report provides an overview of the work carried out in improving Language Model (LM) development used during the decoding of an Automatic Speech Recognition (ASR) system. The goal of this work is to develop a robust language model that can be adapted ...
Human motion prediction, the task of predicting future 3D human poses given a sequence of observed ones, has been mostly treated as a deterministic problem. However, human motion is a stochastic process: Given an observed sequence of poses, multiple future ...
Speech Emotion Recognition (SER) has been shown to benefit from many of the recent advances in deep learning, including recurrent-based and attention-based neural network architectures. Nevertheless, performance still falls short of that of humans. ...
Language-independent query-by-example spoken term detection (QbE-STD) is the problem of retrieving, from an archive, the audio documents that contain a spoken query provided by a user. This is usually cast as a hypothesis testing and pattern matching problem ...
Increasing privacy concerns have stimulated interest in Session-based Recommendation (SR) using no personal data other than what is observed in the current browser session. Existing methods are evaluated in static settings which rarely occur in real- ...
We experiment with subword segmentation approaches that are widely used to address the open vocabulary problem in the context of end-to-end automatic speech recognition (ASR). For morphologically rich languages such as German, which has many rare words main ...
Transformers have been proven a successful model for a variety of tasks in sequence modeling. However, computing the attention matrix, which is their key component, has quadratic complexity with respect to the sequence length, thus making them prohibitivel ...
Recent trends of incorporating attention mechanisms in vision have led researchers to reconsider the supremacy of convolutional layers as a primary building block. Beyond helping CNNs to handle long-range dependencies, Ramachandran et al. (2019) showed ...