On Learning to Identify Genders from Raw Speech Signal Using CNNs
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR- based adaptation te ...
In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...
The movement of an organism typically provides an observer with information in more than one sensory modality. The integration of information modalities reduces the likelihood that the observer will be confronted with a scene that is perceptually ambiguous ...
Recently, several multi-layer perceptron (MLP)- based front-ends have been developed and used for Mandarin speech recognition, often showing significant complementary properties to conventional spectral features. Although widely used in multiple Mandarin s ...
In this paper, we propose a novel parts-based binary-valued feature for ASR. This feature is extracted using boosted ensembles of simple threshold-based classifiers. Each such classifier looks at a specific pair of time-frequency bins located on the spectr ...
In this thesis, we investigate a hierarchical approach for estimating the phonetic class-conditional probabilities using a multilayer perceptron (MLP) neural network. The architecture consists of two MLP classifiers in cascade. The first MLP is trained in ...
In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...
Ecole Polytechnique Federale de Lausanne (EPFL)2011
A novel parts-based binary-valued feature termed Boosted Binary Feature (BBF) was recently proposed for ASR. Such features look at specific pairs of time-frequency bins in the spectro-temporal plane. The most discriminative of these features are selected b ...
Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR based adaptation tec ...
We apply multilayer perceptron (MLP) based hierarchical Tandem features to large vocabulary continuous speech recognition in Mandarin. Hierarchical Tandem features are estimated using a cascade of two MLP classifiers which are trained independently. The fi ...