Publication

Linear Complexity Self-Attention With 3rd Order Polynomials

Related publications (38)

Estimating and Improving the Robustness of Attributions in Text

Ádám Dániel Ivánkay

End-to-end learning methods like deep neural networks have been the driving force in the remarkable progress of machine learning in recent years. However, despite their success, the deployment process of such networks in safety-critical use cases, such as ...

EPFL, 2023

Transformer Models for Vision

Jean-Baptiste Francis Marie Juliette Cordonnier

The recent developments of deep learning cover a wide variety of tasks such as image classification, text translation, playing Go, and folding proteins. All these successful methods depend on a gradient-based learning algorithm to train a model on massive a ...

EPFL, 2023

From Kernel Methods to Neural Networks: A Unifying Variational Formulation

Michaël Unser

The minimization of a data-fidelity term and an additive regularization functional gives rise to a powerful framework for supervised learning. In this paper, we present a unifying regularization functional that depends on an operator L ...

New York, 2023

Vision Transformer Adapters for Generalizable Multitask Learning

Sabine Süsstrunk, Mathieu Salzmann, Deblina Bhattacharjee

We introduce the first multitasking vision transformer adapters that learn generalizable task affinities which can be applied to novel tasks and domains. Integrated into an off-the-shelf vision transformer backbone, our adapters can simultaneously solve mu ...

2023

Essays in Empirical Asset Pricing

Alexis Arilès Marchal

This thesis consists of three applications of machine learning techniques to empirical asset pricing. In the first part, which is co-authored work with Oksana Bashchenko, we develop a new method that detects jumps nonparametrically in financial time series ...

EPFL, 2022

Global information processing in feedforward deep networks

Michael Herzog, Ben Henrik Lönnqvist, Adrien Christophe Doerig, Alban Bornet

While deep neural networks are state-of-the-art models of many parts of the human visual system, here we show that they fail to process global information in a humanlike manner. First, using visual crowding as a probe into global visual information process ...

2022

Improving the Training of Compact Neural Networks for Visual Recognition

Shuxuan Guo

During the Artificial Intelligence (AI) revolution of the past decades, deep neural networks have been widely used and have achieved tremendous success in visual recognition. Unfortunately, deploying deep models is challenging because of their huge model s ...

EPFL, 2022

Stop Wasting my FLOPS: Improving the Efficiency of Deep Learning Models

Angelos Katharopoulos

Deep neural networks have completely revolutionized the field of machine learning by achieving state-of-the-art results on various tasks ranging from computer vision to protein folding. However, their application is hindered by their large computational and m ...

EPFL, 2022

Towards Verifiable, Generalizable and Efficient Robust Deep Neural Networks.

Chen Liu

In the last decade, deep neural networks have achieved tremendous success in many fields of machine learning. However, they have been shown to be vulnerable to adversarial attacks: well-designed, yet imperceptible, perturbations can make the state-of-the-art deep ...

EPFL, 2022

Efficient Transformer-Based Speech Recognition

Apoorv Vyas

Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousands of hours of transcribed data, limiting their use to only a few languages. Moreover, current state-of-the-art acoustic models are based on the Transformer ...

EPFL, 2022