Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers
Related publications (37)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
For a long time, natural language processing (NLP) has relied on generative models with task specific and manually engineered features. Recently, there has been a resurgence of interest for neural networks in the machine learning community, obtaining state ...
In this paper we propose a new technique for robust keyword spotting that uses bidirectional Long Short-Term Memory (BLSTM) recurrent neural nets to incorporate contextual information in speech decoding. Our approach overcomes the drawbacks of generative H ...
Different training and adaptation techniques for multilingual Automatic Speech Recognition (ASR) are explored in the context of hybrid systems, exploiting Deep Neural Networks (DNN) and Hidden Markov Models (HMM). In multilingual DNN training, the hidden l ...
Second-order pooling, a.k.a. bilinear pooling, has proven effective for deep learning based visual recognition. However, the resulting second-order networks yield a final representation that is orders of magnitude larger than that of standard, first-order ...
Pedestrians follow different trajectories to avoid obstacles and accommodate fellow pedestrians. Any autonomous vehicle navigating such a scene should be able to foresee the future positions of pedestrians and accordingly adjust its path to avoid collision ...
Automatic visual speech recognition is an interesting problem in pattern recognition especially when audio data is noisy or not readily available. It is also a very challenging task mainly because of the lower amount of information in the visual articulati ...
Humans are able to learn and compose complex, yet beautiful, pieces of music as seen in e.g. the highly complicated works of J.S. Bach. However, how our brain is able to store and produce these very long temporal sequences is still an open question. Long s ...