The segmentation of multi-channel meeting recordings for automatic speech recognition
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two layer feedforward linear network with a diagonal ...
Photometric stereo, a computer vision technique for estimating the 3D shape of objects through images captured under varying illumination conditions, has been a topic of research for nearly four decades. In its general formulation, photometric stereo is an ...
Deep learning has revolutionized the field of computer vision, a success largely attributable to the growing size of models, datasets, and computational power.Simultaneously, a critical pain point arises as several computer vision applications are deployed ...
Auditory research aims in general to lead to understanding of physiological processes. By contrast, the state of the art in automatic speech processing (notably recognition) is dominated by large pre-trained models that are meant to be used as black-boxes. ...
Recent developments in neural architecture search (NAS) emphasize the significance of considering robust architectures against malicious data. However, there is a notable absence of benchmark evaluations and theoretical guarantees for searching these robus ...
The ability to reason, plan and solve highly abstract problems is a hallmark of human intelligence. Recent advancements in artificial intelligence, propelled by deep neural networks, have revolutionized disciplines like computer vision and natural language ...
Topographical disorientation refers to the selective inability to orient oneself in familiar surroundings. However, to date its neural correlates remain poorly understood. Here we use quantitative lesion analysis and a lesion network mapping approach in or ...
Throughout history, the pace of knowledge and information sharing has evolved into an unthinkable speed and media. At the end of the XVII century, in Europe, the ideas that would shape the "Age of Enlightenment" were slowly being developed in coffeehouses, ...
Fluorescence lifetime imaging (FLI) has been receiving increased attention in recent years as a powerful diagnostic technique in biological and medical research. However, existing FLI systems often suffer from a tradeoff between processing speed, accuracy, ...
Recent years have witnessed significant advance- ment in face recognition (FR) techniques, with their applications widely spread in people’s lives and security-sensitive areas. There is a growing need for reliable interpretations of decisions of such syste ...