
Add cpu_offload_with_hook #1045

Merged: 6 commits into main, Feb 7, 2023
Conversation

@sgugger (Collaborator) commented Feb 7, 2023

This addresses a feature request from Diffusers. This PR adds a new cpu_offload_with_hook function that offloads the model to CPU, then moves it back to the GPU when executed, but without offloading it right after the forward pass the way cpu_offload does. Instead, it is up to the user to call hook.offload() to offload the model again.

Example:

import torch
import torch.nn as nn
from accelerate import cpu_offload_with_hook

model = nn.Linear(4, 5)
model, hook = cpu_offload_with_hook(model)
print(model.weight.device)  # cpu, weights start offloaded

inputs = torch.randn(2, 4)
outputs = model(inputs)
print(outputs.device)  # Outputs are on GPU, execution was done on GPU
print(model.weight.device)  # Stays on the GPU until hook.offload() is called

hook.offload()
print(model.weight.device)  # Back to cpu

cc @pcuenca @patrickvonplaten

@HuggingFaceDocBuilderDev commented Feb 7, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten (Contributor) left a comment


This looks very nice! If possible, we were also envisioning passing a previous user_hook to the CpuOffload hook, so that the previous model can be offloaded when the next one runs its forward. E.g. so that the following would be possible:

model_1, hook_1 = cpu_offload_with_hook(model_1, cuda_device)
model_2, hook_2 = cpu_offload_with_hook(model_2, cuda_device, user_hook=hook_1)
model_3, hook_3 = cpu_offload_with_hook(model_3, cuda_device, user_hook=hook_2)

so that the following would automatically work:

hid_1 = model_1(input)
for i in range(50):
    hid_2 = model_2(hid_1)
hid_3 = model_3(hid_2)

Alternatively, we could also call the hooks ourselves if preferred, but being able to pass the hooks directly as shown in the code review would make the diffusers code much nicer, I think :-)
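The chaining being proposed can be sketched in plain Python, independent of torch and of accelerate's actual hook classes (the class and attribute names below are illustrative stand-ins, not the merged API): each hook, before its own model runs, offloads the previous hook's model in the chain, so at most one model sits on the GPU at a time.

```python
# Illustrative sketch only: mimics the proposed user_hook chaining.
# "Offloading" here just flips a device string; no real tensors move.

class OffloadHook:
    """Toy stand-in for a CPU-offload hook (hypothetical, not accelerate's API)."""
    def __init__(self, model, prev_hook=None):
        self.model = model
        self.prev_hook = prev_hook
        model.device = "cpu"  # model starts offloaded

    def pre_forward(self):
        # Offload the previous model in the chain before loading this one.
        if self.prev_hook is not None:
            self.prev_hook.offload()
        self.model.device = "cuda"

    def offload(self):
        self.model.device = "cpu"


class ToyModel:
    def __init__(self, name):
        self.name = name
        self.device = "cpu"


# Hypothetical two-stage pipeline (names are made up for the sketch).
model_1, model_2 = ToyModel("stage_1"), ToyModel("stage_2")
hook_1 = OffloadHook(model_1)
hook_2 = OffloadHook(model_2, prev_hook=hook_1)

hook_1.pre_forward()                   # model_1 loads onto the GPU
hook_2.pre_forward()                   # model_1 is offloaded, model_2 loads
print(model_1.device, model_2.device)  # cpu cuda
hook_2.offload()                       # final offload is left to the caller
print(model_2.device)                  # cpu
```

This is the pattern the snippet above relies on: the loop over model_2 never touches the hooks, because model_3's hook takes care of offloading model_2 when its own forward starts.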

Review threads (resolved): src/accelerate/big_modeling.py, src/accelerate/hooks.py
sgugger and others added 3 commits February 7, 2023 11:01
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@pacman100 (Contributor) commented

Really Cool!

@patrickvonplaten (Contributor) left a comment


Awesome! Thanks a mille for the super fast PR ❤️

@sgugger sgugger merged commit 71e81ba into main Feb 7, 2023
@sgugger sgugger deleted the cpu_offload_hook branch February 7, 2023 18:09