Delves into Deep Learning for Natural Language Processing, exploring Neural Word Embeddings, Recurrent Neural Networks, and Attentive Neural Modeling with Transformers.
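As a minimal sketch of the embedding idea, the snippet below uses a hypothetical toy vocabulary and randomly initialized vectors (not trained embeddings) to show the core operation: a lookup table mapping words to dense vectors, compared by cosine similarity.

```python
import numpy as np

# Hypothetical toy vocabulary; real embeddings are learned from data.
vocab = {"king": 0, "queen": 1, "apple": 2}
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 4))  # one 4-d vector per word

def embed(word):
    """Map a word to its dense vector via a table lookup."""
    return E[vocab[word]]

def cosine(u, v):
    """Cosine similarity, the usual measure of embedding closeness."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(embed("king"), embed("queen")))
```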
Explores decoding from neural models in modern NLP, covering encoder-decoder models, decoding algorithms, issues with argmax decoding, and the impact of beam size.
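The contrast between argmax (greedy) decoding and beam search can be sketched in a few lines. The `log_probs` stub below is a hypothetical stand-in for a real decoder step; greedy decoding is simply the `beam_size=1` case, and widening the beam keeps more partial hypotheses alive at each step.

```python
import numpy as np

def log_probs(prefix, V=5):
    """Stand-in for one decoder step: log P(next token | prefix).
    (Hypothetical toy distribution; a real system runs the network here.)"""
    rng = np.random.default_rng(abs(hash(tuple(prefix))) % (2**32))
    return np.log(rng.dirichlet(np.ones(V)))

def beam_search(beam_size=3, steps=4):
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, lp in enumerate(log_probs(seq)):
                candidates.append((seq + [tok], score + lp))
        # Keep only the beam_size highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

print(beam_search())            # beam search with beam size 3
print(beam_search(beam_size=1)) # greedy (argmax) decoding
```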
Explains the full architecture of Transformers and the self-attention mechanism, highlighting the paradigm shift toward fully pretrained models.
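The self-attention mechanism itself reduces to a short computation. The sketch below omits the learned query/key/value projections and multiple heads for brevity, so it illustrates the scaled dot-product core rather than a complete Transformer layer.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention over token matrix X.
    (No learned W_Q, W_K, W_V projections here, for brevity.)"""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ X  # each output is a weighted sum of the values

X = np.random.default_rng(0).normal(size=(3, 4))  # 3 tokens, d = 4
print(self_attention(X).shape)  # (3, 4): one contextual vector per token
```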
Explores the Transformer model, tracing the move from recurrent models to attention-based NLP and highlighting its key components and significant results in machine translation and document generation.
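Among those key components is the sinusoidal positional encoding from the original Transformer paper, which injects token order into an otherwise order-agnostic attention model. A compact NumPy version (the dimensions in the usage line are illustrative):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal position encodings as in 'Attention Is All You Need':
    sine on even dimensions, cosine on odd dimensions."""
    pos = np.arange(seq_len)[:, None]        # positions 0..seq_len-1
    i = np.arange(d_model)[None, :]          # embedding dimensions
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

print(positional_encoding(4, 8)[0])  # position 0: alternating 0s and 1s
```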