DiffusionBlocks: A New Method for Block-by-Block Neural Network Training That Cuts Memory Use Severalfold
DiffusionBlocks splits a neural network into independent blocks that are trained one at a time, so training memory no longer grows linearly with network depth and instead scales with the size of a single block. Validated on ViT, DiT, autoregressive Transformers, and more.