
A Study on More Realistic Room Simulation for Far-Field Keyword Spotting

Eric Bezzam, Robin Scheibler
2020
Conference paper
Abstract

We investigate the impact of more realistic room simulation for training far-field keyword spotting systems without fine-tuning on in-domain data. To this end, we study the effect of incorporating the following factors in the room impulse response (RIR) generation: air absorption, surface- and frequency-dependent coefficients of real materials, and stochastic ray tracing. Through an ablation study, a wake word task is used to measure the impact of these factors in comparison with a ground-truth set of measured RIRs. On a hold-out set of re-recordings under clean and noisy far-field conditions, we demonstrate up to 35.8% relative improvement over the commonly used (single absorption coefficient) image source method. Source code is made available in the Pyroomacoustics package, allowing others to incorporate these techniques in their work.
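To give a rough feel for why a single absorption coefficient differs from surface- and frequency-dependent coefficients with air absorption, the sketch below estimates reverberation time (RT60) per octave band. It is not the paper's method: it swaps in Sabine's statistical formula, RT60 = 0.161 V / (A + 4 m V), in place of the image source method and ray tracing, and it does not use the Pyroomacoustics API. The room dimensions, absorption coefficients, and air attenuation values are all hypothetical placeholders chosen for illustration, not measured material data.

```python
# Hypothetical shoebox room dimensions in metres (assumption, not from the paper).
LENGTH, WIDTH, HEIGHT = 6.0, 4.0, 3.0

# Octave-band centre frequencies in Hz.
BANDS_HZ = [500, 1000, 2000, 4000]

# Illustrative absorption coefficients per band for each surface group.
# These are placeholder values, NOT measured coefficients of real materials.
MATERIALS = {
    "floor": [0.10, 0.20, 0.35, 0.50],
    "ceiling": [0.60, 0.65, 0.70, 0.70],
    "walls": [0.03, 0.04, 0.05, 0.06],
}

# Illustrative air attenuation coefficients m (1/m) per band; also placeholders.
AIR_M = [0.0005, 0.001, 0.002, 0.006]


def surface_areas(length, width, height):
    """Return the area in m^2 of each surface group of a shoebox room."""
    return {
        "floor": length * width,
        "ceiling": length * width,
        "walls": 2.0 * height * (length + width),
    }


def sabine_rt60(volume, absorption_area, air_m=0.0):
    """Sabine estimate of RT60 in seconds, with an optional air absorption term."""
    return 0.161 * volume / (absorption_area + 4.0 * air_m * volume)


def absorption_area_per_band(areas, materials, n_bands):
    """Sum area * coefficient over all surface groups, separately for each band."""
    return [
        sum(areas[name] * materials[name][band] for name in areas)
        for band in range(n_bands)
    ]


volume = LENGTH * WIDTH * HEIGHT
areas = surface_areas(LENGTH, WIDTH, HEIGHT)
band_areas = absorption_area_per_band(areas, MATERIALS, len(BANDS_HZ))

# Baseline: one frequency-independent absorption area (average over bands), no air term.
single_area = sum(band_areas) / len(band_areas)
rt60_single = sabine_rt60(volume, single_area)

# Frequency-dependent estimates, without and with air absorption.
rt60_bands = [sabine_rt60(volume, a) for a in band_areas]
rt60_bands_air = [sabine_rt60(volume, a, m) for a, m in zip(band_areas, AIR_M)]

print(f"single coefficient: RT60 = {rt60_single:.3f} s")
for freq, no_air, with_air in zip(BANDS_HZ, rt60_bands, rt60_bands_air):
    print(f"{freq:>5} Hz: RT60 = {no_air:.3f} s, with air absorption = {with_air:.3f} s")
```

In this toy setup the single-coefficient estimate yields one decay time for all frequencies, whereas the per-band estimates differ from band to band and become shorter once the air term is included. The RIR generation described in the abstract models these effects in far more detail.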

