Lattice-Free Mmi Adaptation Of Self-Supervised Pretrained Acoustic Models
Related publications (32)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
In this paper, we propose a novel Deep Micro-Dictionary Learning and Coding Network (DDLCN). DDLCN has most of the standard deep learning layers (pooling, fully, connected, input/output, etc.) but the main difference is that the fundamental convolutional l ...
Thanks to the digital preservation of cultural heritage materials, multimedia tools (e.g., based on automatic visual processing) considerably ease the work of scholars in the humanities and help them to perform quantitative analysis of their data. In this ...
Background: The discovery of the CRISPR-Cas9-based gene editing method has opened unprecedented new potential for biological and medical engineering, sparking a growing public debate on both the potential and dangers of CRISPR applications. Given the speed ...
Natural language processing techniques are dependent upon punctuation to work well. When their input is taken from speech recognition, it is necessary to reconstruct the punctuation; in particular sentence boundaries. We define a range of features from low ...
Clinical applications, such as image-guided surgery and noninvasive diagnosis, rely heavily on multi-modal images. Medical image fusion plays a central role by integrating information from multiple sources into a single, more understandable output. We prop ...
Estimating the 3D poses of rigid and articulated bodies is one of the fundamental problems of Computer Vision. It has a broad range of applications including augmented reality, surveillance, animation and human-computer interaction. Despite the ever-growin ...
This study deals with semantic segmentation of high-resolution (aerial) images where a semantic class label is assigned to each pixel via supervised classification as a basis for automatic map generation. Recently, deep convolutional neural networks (CNNs) ...
Institute of Electrical and Electronics Engineers2017
In this paper, we explore various approaches for semi-
supervised learning in an end-to-end automatic speech recog-
nition (ASR) framework. The first step in our approach in-
volves training a seed model on the limited amount of labelled
data. Additional u ...
Modern 3D human pose estimation techniques rely on deep networks, which require large amounts of training data. While weakly-supervised methods require less supervision, by utilizing 2D poses or multi-view imagery without annotations, they still need a suf ...
Natural language processing techniques are dependent upon punctuation to work well. When their input is taken from speech recognition, it is necessary to reconstruct the punctuation; in particular sentence boundaries. We define a range of features from low ...