Introducing Temporal Asymmetries in Feature Extraction for Automatic Speech Recognition
Related publications (32)
Many pathologies cause impairments in the speech production mechanism resulting in reduced speech intelligibility and communicative ability. To assist the clinical diagnosis, treatment and management of speech disorders, automatic pathological speech asses ...
Voice activity detection (VAD) is an important pre-processing step for speech technology applications. The task consists of deriving segment boundaries of audio signals which contain voicing information. In recent years, it has been shown that voice source ...
In wearable-based human activity recognition (HAR) research, one of the major challenges is the large intra-class variability problem. The collected activity signal is often, if not always, coupled with noises or bias caused by personal, environmental, or ...
In this work, we present a simple biometric indexing scheme that bins and retrieves cancelable deep face templates based on frequent binary patterns. The simplicity of the proposed approach makes it applicable to unprotected as well as protected, i ...
Technological progress in materials science and microengineering along with new discoveries in neuroscience have contributed to restore lost or impaired sensory functions by closely interfacing with the nervous system. Electronic devices have begun to be i ...
We introduce HP, an implementation of density-functional perturbation theory, designed to compute Hubbard parameters (on-site U and inter-site V) in the framework of DFT+U and DFT+U+V. The code does not require the use of computationally expensive superce ...
While public speech resources are becoming increasingly available, there is growing interest in preserving the privacy of speakers through methods that anonymize speaker information in speech while preserving the spoken linguistic content. In this pap ...
A remote microphone (RM) system can be used in combination with wearable binaural communication devices, such as hearing aids (HAs), to improve speech intelligibility. Typically, a speaker is equipped with a body-worn microphone which enables to pick up th ...
The respiratory system is an integral part of human speech production. As a consequence, there is a close relation between respiration and speech signal, and the produced speech signal carries breathing pattern related information. Speech can also be gener ...
Sentiment analysis is the automated coding of emotions expressed in text. Sentiment analysis and other types of analyses focusing on the automatic coding of textual documents are increasingly popular in psychology and computer science. However, the potenti ...