Explores chemical reaction prediction with generative models and molecular transformers, emphasizing molecular language processing and the correct handling of stereochemistry.
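As a minimal illustration of molecular language processing, the sketch below tokenizes a SMILES string with a regex. The pattern is a simplified, assumed variant of the tokenizers commonly paired with molecular transformers, not the exact one from any particular model; note the `@`/`@@` alternatives, which capture the stereochemistry markers mentioned above.

```python
import re

# Simplified SMILES token pattern (an assumption for illustration; production
# molecular-transformer tokenizers use a more exhaustive regex).
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnops]|@@?|=|#|\(|\)|/|\\|\+|-|\d)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into chemically meaningful tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Sanity check: tokenization should be lossless.
    assert "".join(tokens) == smiles, "tokenizer dropped characters"
    return tokens

# Example: aspirin, with an aromatic ring and two ester/acid groups.
print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```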
Covers the foundational concepts of deep learning and the Transformer architecture, focusing on neural networks, attention mechanisms, and their applications to sequence modeling tasks.
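To make the attention mechanism concrete, here is a minimal NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V; the shapes and variable names are illustrative rather than taken from the original.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q: (seq_q, d_k), K: (seq_k, d_k), V: (seq_k, d_v).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_q, seq_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                            # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```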
Explores coreference resolution models, challenges in scoring spans, graph refinement techniques, state-of-the-art results, and the impact of pretrained Transformers.
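The span-scoring challenge can be sketched as follows. This follows the general shape of end-to-end coreference models, where a pair score combines per-span mention scores with a pairwise antecedent score; all layer sizes and names here are assumptions for illustration, not a specific published model.

```python
import torch
import torch.nn as nn

class SpanScorer(nn.Module):
    """Illustrative coreference scorer: s(i, j) = s_m(i) + s_m(j) + s_a(i, j)."""

    def __init__(self, span_dim: int, hidden: int = 128):
        super().__init__()
        # s_m: how likely a span is to be a mention at all.
        self.mention = nn.Sequential(
            nn.Linear(span_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        # s_a: pairwise compatibility of a span and a candidate antecedent.
        self.antecedent = nn.Sequential(
            nn.Linear(2 * span_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, spans: torch.Tensor) -> torch.Tensor:
        n = spans.size(0)
        s_m = self.mention(spans).squeeze(-1)            # (n,) mention scores
        pairs = torch.cat(
            [spans.unsqueeze(1).expand(n, n, -1),
             spans.unsqueeze(0).expand(n, n, -1)], dim=-1)
        s_a = self.antecedent(pairs).squeeze(-1)         # (n, n) pair scores
        return s_m.unsqueeze(1) + s_m.unsqueeze(0) + s_a  # (n, n) final scores

scores = SpanScorer(span_dim=64)(torch.randn(5, 64))
print(scores.shape)  # torch.Size([5, 5])
```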
Explores the mathematics of language models, covering architecture design, pre-training, and fine-tuning, and how these two training stages adapt models to a variety of downstream tasks.
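As a concrete instance of the pre-training mathematics, the sketch below computes the standard next-token cross-entropy objective, L = −Σₜ log p(xₜ | x₍ₜ₎); the toy vocabulary, random logits, and random tokens are stand-ins assumed for illustration.

```python
import torch
import torch.nn.functional as F

# Toy autoregressive language-model loss: given logits over a vocabulary at
# each position, pre-training minimizes -sum_t log p(x_t | x_<t).
vocab_size, seq_len = 100, 16
torch.manual_seed(0)
logits = torch.randn(seq_len, vocab_size)          # stand-in for model outputs
tokens = torch.randint(0, vocab_size, (seq_len,))  # stand-in for training text

# Shift by one position so the model at step t predicts token t+1,
# as in standard language-model pre-training.
loss = F.cross_entropy(logits[:-1], tokens[1:])
print(loss)  # average negative log-likelihood per predicted token
```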