This lecture covers the use of Molecular Transformer models for multi-step synthesis planning, including retrosynthetic and forward prediction steps, path scoring, and hypergraph exploration strategies. It also examines self-supervised training, the visualization of attention weights, atom-mapping discovery, and benchmarking studies. The lecture further explores the role of AI in synthesis planning, focusing on template-based and graph neural network approaches as well as SMILES-to-SMILES methods. It concludes with insights on applying language-modeling methods to chemistry, data-driven reaction fingerprints, and the impact of Transformers in capturing the grammar of chemical reactions.
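To make the SMILES-to-SMILES framing concrete: forward reaction prediction is cast as sequence-to-sequence translation, where reactant SMILES are tokenized and a Transformer is trained to emit the product SMILES. The sketch below, a minimal illustration rather than the lecture's own code, uses a regex tokenizer in the style of the one published with the Molecular Transformer (Schwaller et al., 2019); the helper name tokenize_smiles and the acetylation example are assumptions chosen for illustration.

    import re

    # Regex tokenizer in the style of the Molecular Transformer's published
    # SMILES tokenization scheme; treat the exact pattern as an approximation.
    SMILES_TOKEN_PATTERN = re.compile(
        r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\."
        r"|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
    )

    def tokenize_smiles(smiles: str) -> list:
        """Split a SMILES (or reaction SMILES) string into model tokens."""
        tokens = SMILES_TOKEN_PATTERN.findall(smiles)
        # Sanity check: tokenization should be lossless.
        assert "".join(tokens) == smiles, "tokenizer dropped characters"
        return tokens

    # Forward prediction as translation: reactants (source) -> product (target).
    reactants = "CC(=O)Cl.OCC"   # acetyl chloride + ethanol (illustrative input)
    product = "CC(=O)OCC"        # ethyl acetate (expected product)

    print(tokenize_smiles(reactants))
    # ['C', 'C', '(', '=', 'O', ')', 'Cl', '.', 'O', 'C', 'C']
    print(tokenize_smiles(product))

Because multi-character atoms such as Cl and Br become single tokens, the model sees chemically meaningful units, which is part of what lets a standard translation architecture pick up the "grammar" of reactions discussed in the lecture.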