Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
With the increasing amount of video being consumed by people daily, there is a danger of the rise in maliciously modified video content (i.e., 'fake news') that could be used to damage innocent people or to impose a certain agenda, e.g., meddle in election ...
This paper addresses the problem of detecting speech utterances from a large audio archive using a simple spoken query, hence referring to this problem as "Query by Example Spoken Term Detection" (QbE-STD). This still open pattern matching problem has been ...
In this work, we address the problem of query by example spoken term detection (QbE-STD) in zero-resource scenario. State of the art solutions usually rely on dynamic time warping (DTW) based template matching. In contrast, we propose here to tackle the pr ...
Air Navigation Service Provider (ANSPs) replace paper flight strips through different digital solutions. The instructed commands from an air traffic controller (ATCOs) are then available in computer readable form. However, those systems require manual cont ...
Polarimetric imaging techniques demonstrate enhanced capabilities in advanced object detection tasks with their capability to discriminate man-made objects from natural background surfaces. While spectral signatures carry information only about material pr ...
This paper explores novel ideas in building end-to-end deep neural network (DNN) based text-dependent speaker verification (SV) system. The baseline approach consists of mapping a variable length speech segment to a fixed dimensional speaker vector by esti ...
We propose a head pose estimation framework that leverages on a recent keypoint detection model. More specifically, we apply the convolutional pose machines (CPMs) to input images, extract different types of facial keypoint features capturing appearance in ...
In this paper, we are interested in exploring Deep Neural Network (DNN) based speaker embedding for Random-digit task using content information. To this end, a technique is applied to automatically select common phonetic units between the enrollment and te ...
State of the art query by example spoken term detection (QbE-STD) systems in zero-resource conditions rely on representation of speech in terms of sequences of class-conditional posterior probabilities estimated by deep neural network (DNN). The posteriors ...
This work tests several classification techniques and acoustic features and further combines them using late fusion to classify paralinguistic information for the ComParE 2018 challenge. We use Multiple Linear Regression (MLR) with Ordinary Least Squares ( ...