LaMBERT: Light and Multigranular BERT

Pre-training complex language models is essential to the success of recent methods such as BERT and OpenAI GPT. Their size makes not only the pre-training phase but also subsequent applications computationally expensive. BERT-like models excel at token-level tasks because they provide reliable token embeddings, but they fall short on embeddings of sentences or higher-level structures, because they have no built-in mechanism that explicitly provides such representations. We introduce Light and Multigranular BERT (LaMBERT), which has a number of parameters similar to BERT's but is about 3 times faster. The speed-up comes from modifying the input representation, which in turn changes the attention mechanism. The model also produces reliable segment embeddings, since learning them is one of its training objectives. The model we publish achieves 70.7% on the MNLI task, which is promising given that it had two major issues.

