Publication

Deep Learning Approaches for Auditory Perception in Robotics

Weipeng He
2021
EPFL thesis

Abstract

Auditory perception is an essential part of a robotic system in Human-Robot Interaction (HRI), and creating an artificial auditory perception system on par with that of humans has been a long-standing goal for researchers. This is a challenging research topic because, in typical HRI scenarios, the audio signal is often corrupted by the robot's ego noise, other background noise and overlapping voices. Traditional approaches based on signal processing seek analytical solutions according to the physical laws of sound propagation as well as assumptions about the signal, noise and environment. However, such approaches either assume over-simplified conditions or build sophisticated models that do not generalize well to real situations.

This thesis introduces an alternative methodology for auditory perception in robotics using deep learning techniques. It includes a group of novel deep learning-based approaches addressing sound source localization, speech/non-speech classification, and speaker re-identification. These approaches rely on neural network models that learn directly from the data without making many assumptions. Experiments with real robots show that they outperform the traditional methods in complex environments with multiple speakers, interfering noise and no a priori knowledge of the number of sources.

In addition, this thesis addresses the high cost of data collection that arises with learning-based approaches. Domain adaptation and data augmentation methods are proposed to exploit simulated data and weakly-labeled real data, so that the effort for data collection is minimized. Overall, this thesis suggests a practical and robust solution for auditory perception in robotics in the wild.

Official source

https://infoscience.epfl.ch/record/283940?ln=en

Ontological neighbourhood

Information engineering

Machine learning: Artificial neural networks

Mechanical engineering

Robotics: Topics in robotics

Related publications (213)

Robust machine learning for neuroscientific inference

Investigating neural resource allocation in the sensorimotor control of extra limbs

Generalization and Personalization of Machine Learning for Multimodal Mobile Sensing in Everyday Life