Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...
In light of steady progress in machine learning, automatic speech recognition (ASR) is entering more and more areas of our daily life, but people with dysarthria and other speech pathologies are left behind. Their voices are underrepresented in the trainin ...
Despite the significant progress in recent years, deep face recognition is often treated as a "black box" and has been criticized for lacking explainability. It becomes increasingly important to understand the characteristics and decisions of deep face rec ...
State-of-the-art face recognition systems require vast amounts of labeled training data. Given the priority of privacy in face recognition applications, the data is limited to celebrity web crawls, which have issues such as limited numbers of identities. O ...
As an 'early alerting' sense, one of the primary tasks for the human visual system is to recognize distant objects. In the specific context of facial identification, this ecologically important task has received surprisingly little attention. Most studies ...
We introduce a new class of succinct arguments, which we call elastic. Elastic SNARKs allow the prover to allocate different resources (such as memory and time) depending on the execution environment and the statement to prove. The resulting output is indep ...
Personalized ranking methods are at the core of many systems that learn to produce recommendations from user feedback. Their primary objective is to identify relevant items from very large vocabularies and to assist users in discovering new content. These ...
Voice activity detection (VAD) is an important pre-processing step for speech technology applications. The task consists of deriving the boundaries of segments in an audio signal that contain voicing information. In recent years, it has been shown that voice source ...
Speech recognition-based applications, building on advances in artificial intelligence, play an essential role in transforming most aspects of modern life. However, speech recognition in real-life conditions (e.g., in the presence of overlapping speech, varyin ...
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI)-based tools. The virtual simulation-pilot engine receives spoken ...