Lecture

Second-Order Model Compression

Description

This lecture covers second-order compression of massive deep neural networks, focusing on models with billions of parameters such as OpenAI's GPT-3. It discusses the challenges of running such massive models, the concept of model compression, and practical examples of compressing models so they fit on a single GPU. Various pruning techniques and their impact on model accuracy are explored, and the M-FAC second-order pruning approach is introduced. The lecture concludes with insights on compressing GPT models by up to 10x with minimal accuracy loss and potential performance gains.
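To make the second-order idea concrete: approaches in the M-FAC family build on the classical Optimal Brain Surgeon criterion, which scores each weight by the estimated loss increase its removal would cause, rho_i = w_i^2 / (2 [H^-1]_ii), where H is the Hessian of the loss. The sketch below is a minimal illustration of that criterion, not the lecture's exact algorithm: it assumes a precomputed diagonal of the inverse Hessian, whereas M-FAC itself works matrix-free, computing inverse-Hessian-vector products from gradient outer products. The function names, the sparsity parameter, and the placeholder curvature estimate are all illustrative.

    import torch

    def obs_saliency(weights, hess_inv_diag):
        # Optimal Brain Surgeon saliency: rho_i = w_i^2 / (2 * [H^-1]_ii).
        # Weights with the smallest saliency are the cheapest to remove.
        return weights.pow(2) / (2.0 * hess_inv_diag)

    def prune_by_saliency(weights, hess_inv_diag, sparsity=0.9):
        # Zero out the `sparsity` fraction of weights with the lowest saliency.
        scores = obs_saliency(weights.flatten(), hess_inv_diag.flatten())
        k = max(1, int(sparsity * scores.numel()))
        threshold = scores.kthvalue(k).values
        mask = (scores > threshold).float().view_as(weights)
        return weights * mask

    # Illustrative usage on a random layer with a stand-in curvature estimate:
    w = torch.randn(256, 256)
    h_inv_diag = torch.rand_like(w) + 0.1  # placeholder for a real inverse-Hessian diagonal
    w_90_sparse = prune_by_saliency(w, h_inv_diag, sparsity=0.9)

Compared with simple magnitude pruning, which keeps the largest weights regardless of curvature, the second-order score can preserve small weights that sit in high-curvature directions, which is what allows the high sparsities with minimal loss discussed in the lecture.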
