Explores the Transformer model, tracing the shift from recurrent models to attention-based NLP and highlighting its key components and headline results in machine translation and document generation.
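To make the attention mechanism behind this concrete, here is a minimal NumPy sketch of scaled dot-product attention, the Transformer's core operation; the function name, shapes, and toy inputs are illustrative, not taken from the chapter.

```python
# Minimal sketch of scaled dot-product attention; shapes and inputs
# are illustrative, not from the source material.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d), K: (m, d), V: (m, d_v) -> output: (n, d_v)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # scaled pairwise similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries of dimension 8
K = rng.normal(size=(6, 8))   # 6 keys of dimension 8
V = rng.normal(size=(6, 8))   # 6 values of dimension 8
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```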
Delves into Deep Learning for Natural Language Processing, covering Neural Word Embeddings, Recurrent Neural Networks, and Attentive Neural Modeling with Transformers.
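As a rough companion to the embeddings-and-RNNs material, here is a minimal PyTorch sketch of how token embeddings feed a recurrent encoder; the vocabulary size, dimensions, and toy batch are made-up illustrations, not the chapter's own setup.

```python
# Minimal sketch: word embeddings feeding a recurrent encoder.
# All sizes and the random batch below are illustrative.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128
embedding = nn.Embedding(vocab_size, embed_dim)      # token ids -> dense vectors
encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)

token_ids = torch.randint(0, vocab_size, (2, 7))     # batch of 2 sequences, length 7
vectors = embedding(token_ids)                       # (2, 7, 64)
outputs, final_state = encoder(vectors)              # per-step states, last state
print(outputs.shape, final_state.shape)              # (2, 7, 128) (1, 2, 128)
```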
Explores pretraining of sequence-to-sequence models with BART and T5, discussing transfer learning, fine-tuning, model architectures, pretraining tasks, and a performance comparison, including summarization results.
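For a sense of what inference with a pretrained seq2seq model looks like in practice, here is a minimal sketch assuming the Hugging Face `transformers` library and the public `t5-small` checkpoint; the checkpoint choice and input text are illustrative assumptions rather than the chapter's own code. T5 frames summarization as text-to-text generation, signaled by the `summarize:` prefix.

```python
# Minimal sketch of summarization with a pretrained seq2seq model,
# assuming the Hugging Face `transformers` library; checkpoint and
# input text are illustrative.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

text = ("summarize: The Transformer replaced recurrence with attention, "
        "enabling large-scale pretraining of sequence-to-sequence models "
        "that transfer well to tasks such as summarization.")
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Fine-tuning follows the same pattern: the pretrained weights are loaded as above and then trained further on task-specific input-output text pairs.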