Accelerating Generative AI

YouTube

Description

There is a Cambrian explosion of performant and efficient methods to train and serve generative AI models within the community. The PyTorch team will present optimizations to transformer based Generative AI models, using pure, native PyTorch. In this talk we aim to cover both new techniques in PyTorch for driving efficiency gains, as well as showcasing how they can be composed on popular Generative AI models. Highlights will include methods spanning torch compile, quantization, sparsity, memory efficient attention, reducing padding.

PyVideo

Accelerating Generative AI

Description

Details