Behavior of a Bayesian adaptation method for incremental enrollment in speaker verification
Speech recognition-based applications, building on advances in artificial intelligence, play an essential role in transforming most aspects of modern life. However, speech recognition in real-life conditions (e.g., in the presence of overlapping speech, varyin ...
Automatic Speech Recognition (ASR), used to assist speech communication between pilots and air-traffic controllers, can significantly reduce the complexity of the task and increase the reliability of transmitted information. ASR application can lead ...
There is a growing recognition that electronic band structure is a local property of materials and devices, and there is steep growth in capabilities to collect the relevant data. New photon sources, from small-laboratory-based lasers to free electron lase ...
The task of Heterogeneous Face Recognition consists in matching face images that are sensed in different domains, such as sketches to photographs (visual spectra images), thermal images to photographs or near-infrared images to photographs. In this work we ...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. In recent years, unsupervised and self-supervised techniques for learning speech representation were developed to foster automatic speech recognition. Up to ...
EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP, 2021
Subword modeling for zero-resource languages aims to learn low-level representations of speech audio without using transcriptions or other resources from the target language (such as text corpora or pronunciation dictionaries). A good representation should ...
Feature extraction is a key step in many machine learning and signal processing applications. For speech signals in particular, it is important to derive features that contain both the vocal characteristics of the speaker and the content of the speech. In ...
Despite the significant progress in recent years, deep face recognition is often treated as a "black box" and has been criticized for lacking explainability. It becomes increasingly important to understand the characteristics and decisions of deep face rec ...
We propose a deep neural network based image-to-image translation for domain adaptation, which aims at finding translations between image domains. Despite recent GAN based methods showing promising results in image-to-image translation, they are prone to f ...
In the literature, the task of dysarthric speech intelligibility assessment has been approached through development of different low-level feature representations, subspace modeling, phone confidence estimation or measurement of automatic speech recognitio ...