
New Activation Checkpointing APIs in PyTorch

Description

Activation checkpointing is a commonly used technique for reducing memory usage during model training by reducing the number of activations saved for the backward pass. Instead of keeping the tensors needed for gradient computation alive from the forward pass until they are used during backward, those tensors are recomputed during the backward pass. This talk introduces new activation checkpointing APIs that can help achieve a better trade-off between the memory savings and the compute overhead that recomputation introduces.
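As background for the talk, here is a minimal sketch of the save-versus-recompute trade-off using the existing torch.utils.checkpoint.checkpoint API; the Block and Model classes are illustrative stand-ins, not code from the talk itself.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """A feed-forward block whose intermediate activations we would
    rather recompute during backward than keep alive in memory."""

    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        return self.net(x)


class Model(nn.Module):
    def __init__(self, dim=256, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            # checkpoint() frees the block's intermediate activations
            # after the forward pass and recomputes them on demand
            # during backward, trading extra compute for lower peak memory.
            x = checkpoint(block, x, use_reentrant=False)
        return x


model = Model()
x = torch.randn(16, 256, requires_grad=True)
model(x).sum().backward()
```

The newer APIs the talk describes aim to give finer control over this trade-off than the all-or-nothing recomputation shown here.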
