This lecture covers second-order model compression for massive deep neural networks, focusing on models with billions of parameters such as OpenAI's GPT-3. It discusses the challenges of running such massive models, the concept of model compression, and practical examples of compressing models so they fit on a single GPU. Various pruning techniques and their impact on model accuracy are explored, and the M-FAC pruning approach is introduced. The lecture concludes with insights on compressing GPT models by up to 10x with minimal accuracy loss and potential performance gains.
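To make the pruning idea concrete, here is a minimal sketch of one-shot magnitude pruning, the simple baseline against which second-order methods like M-FAC are typically compared. This is an illustrative example, not the lecture's M-FAC algorithm (which uses second-order curvature information to decide which weights to remove); the function name and NumPy-based setup are my own assumptions.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Baseline one-shot pruning: zero out the smallest-magnitude
    fraction of weights. M-FAC instead ranks weights using
    second-order (Hessian-based) information, which this sketch
    does NOT implement."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 90% of a random weight matrix
rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512))
pruned = magnitude_prune(w, 0.9)
print(f"sparsity: {(pruned == 0).mean():.2f}")  # ~0.90
```

In practice, one-shot magnitude pruning at high sparsity degrades accuracy noticeably, which is exactly the gap that second-order approaches aim to close.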