Explores pretraining sequence-to-sequence models with BART and T5, discussing transfer learning, fine-tuning, model architectures, downstream tasks, and performance comparisons, and closing with summarization results and references.
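A minimal sketch of the fine-tuning step described here, assuming the Hugging Face Transformers library and the public "t5-small" checkpoint; the example strings, learning rate, and single-step loop are illustrative, not the configuration used in the source.

```python
# Sketch: one fine-tuning step for summarization with a pretrained T5 checkpoint.
# Assumes the Hugging Face Transformers library; "t5-small" and the example
# text/summary pair are illustrative only.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

document = "summarize: The quick brown fox jumped over the lazy dog near the river bank."
summary = "A fox jumped over a dog."

inputs = tokenizer(document, return_tensors="pt", truncation=True)
labels = tokenizer(summary, return_tensors="pt", truncation=True).input_ids

# Seq2seq fine-tuning: the encoder reads the document, and the decoder is
# trained with teacher forcing to produce the reference summary.
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```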
Explores the mathematics of language models, covering architecture design, pre-training, and fine-tuning, and emphasizes how the pre-train-then-fine-tune paradigm underpins performance across a range of downstream tasks.
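A standard formulation of the two stages mentioned above: pre-training minimizes a next-token likelihood objective over a large corpus, and fine-tuning continues optimizing the likelihood of task-specific targets starting from the pre-trained parameters. The notation is generic rather than taken from the source.

```latex
% Pre-training: maximize the likelihood of raw text under an autoregressive model.
\mathcal{L}_{\text{pretrain}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\!\left(x_t \mid x_{<t}\right)

% Fine-tuning: continue from the pre-trained \theta on task pairs (x, y),
% e.g. an input sequence x and its target output y.
\mathcal{L}_{\text{finetune}}(\theta) = -\sum_{(x,y)} \log p_\theta\!\left(y \mid x\right)
```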
Explores chemical reaction prediction using generative models and molecular transformers, emphasizing the treatment of molecules as a language and the importance of correctly handling stereochemistry.
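A small sketch of the "molecules as language" idea: reactant SMILES strings are split into tokens and fed to a sequence-to-sequence model that generates the product SMILES. The regex below is a commonly used SMILES tokenization pattern and the esterification example is illustrative, not drawn from the source.

```python
# Sketch: tokenizing reactant SMILES for sequence-to-sequence reaction prediction.
# The regex is a commonly used SMILES tokenization pattern; the example reaction
# (acetic acid + ethanol) is illustrative only.
import re

SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|Si|@@|%\d{2}|[BCNOSPFIbcnosp]|[0-9]|\(|\)|\.|=|#|-|\+|/|\\|@|:|~|\*|>)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into chemically meaningful tokens.

    Tokens such as @, @@, / and \\ encode stereochemistry, which the model
    must reproduce correctly in the predicted product.
    """
    return SMILES_TOKEN.findall(smiles)

# Reactants on the source side; a transformer decoder would be trained to emit
# the product SMILES (ethyl acetate, CCOC(C)=O) token by token.
reactants = "CC(=O)O.OCC"
print(tokenize_smiles(reactants))
# ['C', 'C', '(', '=', 'O', ')', 'O', '.', 'O', 'C', 'C']
```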
Explains the full Transformer architecture and the self-attention mechanism, highlighting the paradigm shift towards building on fully pretrained models.
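A compact sketch of the core operation referenced here, single-head scaled dot-product self-attention; the toy dimensions and random input are illustrative.

```python
# Sketch: single-head scaled dot-product self-attention.
# Dimensions and the random input are illustrative only.
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v      # project into query/key/value spaces
    scores = q @ k.T / k.shape[-1] ** 0.5    # similarity of every position to every other
    weights = F.softmax(scores, dim=-1)      # each row is an attention distribution
    return weights @ v                       # weighted sum of values per position

d_model, d_head, seq_len = 16, 8, 5
x = torch.randn(seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_head) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)       # shape: (seq_len, d_head)
```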
Delves into the training and applications of Vision-Language-Action models, emphasizing the role of large language models in robotic control and the transfer of web-scale knowledge to robots, and highlights experimental results and directions for future research.
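A toy sketch of one idea behind Vision-Language-Action models: continuous robot actions are discretized into a small vocabulary of bins so a language model can emit them as ordinary tokens. The 256-bin scheme, value range, and token naming are assumptions for illustration, not the encoding used in the source.

```python
# Sketch: representing a continuous robot action as discrete tokens so a language
# model can generate it. The 256-bin scheme, [-1, 1] range, and token naming are
# illustrative assumptions.
import numpy as np

NUM_BINS = 256  # assumed size of the action-token vocabulary

def action_to_tokens(action: np.ndarray, low: float = -1.0, high: float = 1.0) -> list[str]:
    """Map each action dimension (e.g. end-effector dx, dy, dz, gripper) to a bin token."""
    clipped = np.clip(action, low, high)
    bins = np.round((clipped - low) / (high - low) * (NUM_BINS - 1)).astype(int)
    return [f"<act_{b}>" for b in bins]

def tokens_to_action(tokens: list[str], low: float = -1.0, high: float = 1.0) -> np.ndarray:
    """Invert the mapping: decode bin tokens back to approximate continuous values."""
    bins = np.array([int(t.removeprefix("<act_").removesuffix(">")) for t in tokens])
    return low + bins / (NUM_BINS - 1) * (high - low)

tokens = action_to_tokens(np.array([0.12, -0.40, 0.85, 1.0]))
print(tokens)                    # four bin tokens, one per action dimension
print(tokens_to_action(tokens))  # approximately recovers the original action
```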