Tracking vocal tract formant frequencies (fp) and estimating the fundamental frequency (f0) are two tracking problems that have been tackled in many speech processing works, often independently, with applications to articulatory parameter estimation ...
Institute of Electrical and Electronics Engineers, 2013
Phonological features extracted by a neural network have shown interesting potential for low bit rate speech vocoding. The span of phonological features is wider than the span of phonetic features, and thus fewer frames need to be transmitted. Moreover, the ...
2015
In this thesis, methods and models are developed and presented aiming at the estimation, restoration and transformation of the characteristics of human speech. During a first period of the thesis, a concept was developed that allows restoring prosodic voic ...
EPFL, 2012
The log-energy parameter, typically derived from a full-band spectrum, is a critical feature commonly used in automatic speech recognition (ASR) systems. However, log-energy is difficult to estimate reliably in the presence of background noise. In this pap ...
In this paper, we propose a platform based on phonological speech vocoding for examining relations between phonology and speech processing, and in broader terms, between the abstract and physical structures of speech signal. The goal of this paper is to go ...
In this paper, we investigate pitch contour modelling in speech synthesis based on segmental units. A convolutional pitch target approximation model is proposed. This model allows joint stochastic modelling of framewise pitch and pitch contour of longer ...
Frequency Domain Linear Prediction (FDLP) represents a technique for auto-regressive modelling of Hilbert envelopes of a signal. In this paper, we propose a speech coding technique that uses FDLP in Quadrature Mirror Filter (QMF) sub-bands of short segment ...
ECMA-407, the first 3D audio standard worldwide, introduces a new concept of static models to lower bitrate coding, which may be equally applied with channels, channels and objects and Higher Order Ambisonics (HOA). Static models may either operate in time ...
In this work, we present a joint source-filter optimization approach for separating voiced speech into vocal tract (VT) and voice source components. The presented method is pitch-synchronous and thereby exhibits a high robustness against vocal jitter, shim ...
The goal of this paper is to investigate features for speech/nonspeech detection (SND) having "minimal" linguistic information from the speech signal. Towards this, we present a comprehensive study of privacy-sensitive features for SND in multiparty conv ...