Publication

Two-level bimodal association for audio-visual speech recognition

Touradj Ebrahimi, Jong Seok Lee
2009
Conference paper

Abstract

This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, in which cross-modal association is considered at two levels. First, the acoustic and visual data streams are combined at the feature level using canonical correlation analysis, which addresses audio-visual synchronization and exploits the cross-modal correlation. Second, the information streams are integrated at the decision level, so that fusion adapts to the noise condition of the given speech datum. Experimental results demonstrate that the proposed method yields noise-robust recognition performance without a priori knowledge of the noise conditions of the speech data.
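The abstract gives no implementation details, so the following is only a minimal sketch of the two levels it names, not the authors' method. It assumes frame-aligned feature matrices, a regularized CCA computed with NumPy, concatenation of the projected streams as the feature-level combination, and a simple score-dispersion reliability measure to set the decision-level weight. All function names and the reliability measure are hypothetical.

```python
import numpy as np

def cca(X, Y, k, reg=1e-6):
    """Top-k canonical projections for X (n x dx) and Y (n x dy).

    Returns (Wx, Wy, corrs). `reg` is a small ridge term added for
    numerical stability (an assumption, not taken from the paper).
    """
    n = X.shape[0]
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both streams with Cholesky factors: T = Lx^{-1} Sxy Ly^{-T}
    Lx, Ly = np.linalg.cholesky(Sxx), np.linalg.cholesky(Syy)
    T = np.linalg.solve(Ly, np.linalg.solve(Lx, Sxy).T).T
    # Singular values of T are the canonical correlations
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt[:k].T)
    return Wx, Wy, s[:k]

def fuse_features(X, Y, Wx, Wy):
    """Feature-level fusion (assumed form): concatenate the projected streams."""
    return np.hstack([(X - X.mean(axis=0)) @ Wx, (Y - Y.mean(axis=0)) @ Wy])

def reliability(loglik):
    """Hypothetical confidence measure: best score minus the mean of the rest."""
    s = np.sort(loglik)[::-1]
    return s[0] - s[1:].mean()

def fuse_decisions(loglik_a, loglik_v):
    """Decision-level fusion: weighted sum of per-class log-likelihoods.

    The weight w on the first stream grows with that stream's reliability,
    so a noisy stream with flat scores contributes less.
    """
    ra, rv = reliability(loglik_a), reliability(loglik_v)
    w = 0.5 if ra + rv == 0 else ra / (ra + rv)
    return w * loglik_a + (1 - w) * loglik_v, w

# Toy usage with synthetic data: one shared latent source drives both streams.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 1))
audio = z @ np.array([[1.0, -0.5, 0.3, 0.8]]) + 0.1 * rng.standard_normal((500, 4))
video = z @ np.array([[0.7, 1.2, -0.4]]) + 0.1 * rng.standard_normal((500, 3))
Wx, Wy, corrs = cca(audio, video, k=1)
fused_features = fuse_features(audio, video, Wx, Wy)
```

In this sketch the weight is recomputed for every utterance from the scores alone, which is one way to adapt to noise without knowing the noise condition in advance. The paper may use a different reliability measure or weighting scheme.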

Official source

https://infoscience.epfl.ch/record/139227?ln=en

Related publications

Localized model order reduction and domain decomposition methods for coupled heterogeneous systems

Plasma-based Electroacoustic Actuator for Broadband Sound Absorption

Transverse Noise, Decoherence, and Landau Damping in High-Energy Hadron Colliders
