An End-to-End Network to Synthesize Intonation using a Generalized Command Response Model

The generalized command response (GCR) model represents intonation as a superposition of muscle responses to spike command signals. We have previously shown that the spikes can be predicted by a two-stage system, consisting of a recurrent neural network and a post-processing procedure, but the responses themselves were fixed dictionary atoms. We propose an end-to-end neural architecture that replaces the dictionary atoms with trainable second-order recurrent elements analogous to recursive filters. We demonstrate gradient stability under modest conditions, and show that the system can be trained by imposing temporal sparsity constraints. Subjective listening tests demonstrate that the system can synthesize intonation with high naturalness, comparable to state-of-the-art acoustic models, and retains the physiological plausibility of the GCR model.
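The idea of a response element behaving like a recursive filter can be illustrated with a minimal sketch. It assumes a critically damped second-order filter (a double real pole) driven by a sparse spike train, with the contour formed by summing several such responses. The function names, pole values, and spike positions below are hypothetical choices for illustration only; they are not the parameterization or training procedure of the paper.

```python
def second_order_response(spikes, pole):
    """Filter a spike train with a critically damped second-order recursive filter.

    Assumed recurrence (illustrative, not taken from the paper):
        y[t] = 2*pole*y[t-1] - pole**2 * y[t-2] + (1 - pole)**2 * x[t]
    with 0 < pole < 1, so the filter is stable and has unit DC gain.
    """
    gain = (1.0 - pole) ** 2  # normalizes the DC gain to 1
    out = []
    y1 = y2 = 0.0  # previous two outputs, zero initial state
    for x in spikes:
        y = 2.0 * pole * y1 - pole * pole * y2 + gain * x
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_contour(commands, poles):
    """Superpose the responses of several filters, one per command track."""
    responses = [second_order_response(s, p) for s, p in zip(commands, poles)]
    return [sum(frame) for frame in zip(*responses)]


# Hypothetical example: two command tracks, each with a single spike.
n_frames = 200
slow_track = [0.0] * n_frames
fast_track = [0.0] * n_frames
slow_track[20] = 1.0    # positive spike on a slowly responding element
fast_track[80] = -0.5   # negative spike on a faster responding element

contour = synthesize_contour([slow_track, fast_track], poles=[0.9, 0.8])
```

In an end-to-end setting, the pole of each element would be a trainable parameter and the spike trains would come from the network's output layer rather than being set by hand as they are here.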
