Improving speech embedding using crossmodal transfer learning with audio-visual data
Related publications (41)
Short-term field study involves groups of students working in an off-campus (sometimes international) setting, and often involves working on realistic, open-ended problems, in interaction with a host community. Such learning experiences are intended to dev ...
Through profiling and matching processes, technology provides individuals with information that becomes redundant to their previous beliefs, attitudes and preferences. The emergence of informational redundancies encouraged by some technologies is likely to ...
Neuromorphic systems provide brain-inspired methods of computing. In a neuromorphic architecture, inputs are processed by a network of neurons receiving operands through synaptic interconnections, tuned in the process of learning. Neurons act simultaneousl ...
Multimedia databases are growing rapidly in size in the digital age. To increase the value of these data and to enhance the user experience, there is a need to make these videos searchable through automatic indexing. Because people appearing and talking in ...
Collaborative learning flow patterns (CLFPs) encode solutions to recurrent pedagogical problems, which have been successfully applied to the design of learning experiences. However, the pedagogical knowledge encoded in these patterns has seldom been exploi ...
Modeling and predicting student learning is an important task in computer-based education. A large body of work has focused on representing and predicting student knowledge accurately. Existing techniques are mostly based on students' performance and on ti ...
Learning speaker turn embeddings has shown considerable improvement in situations where conventional speaker modeling approaches fail. However, this improvement is relatively limited when compared to the gain observed in face embedding learning, which has ...
This paper proposes a novel approach to improve speaker modeling using knowledge transferred from face representation. In particular, we are interested in learning a discriminative metric which allows speaker turns to be compared directly, which is benefic ...
Text autoencoders are commonly used for conditional generation tasks such as style transfer. We propose methods which are plug and play, where any pretrained autoencoder can be used, and only require learning a mapping within the autoencoder's embedding sp ...
Empirical studies document a positive effect of collaboration on team productivity. However, little has been done to assess how knowledge flows among team members. Our study addresses this issue by exploring unique rich data on a Swiss funding program prom ...