feat: add meta device initialization for pretrained models, 5x faster load times #501

Draft: wants to merge 3 commits into main

Conversation

ErwannMillon

Uses accelerate's init_empty_weights context manager to initialize models on the meta device, creating empty placeholder tensors without allocating memory or spending time on random weight initialization.

Checks whether accelerate is installed and falls back to the default open_clip behavior if the package is unavailable, so that accelerate remains an optional dependency.
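The optional-dependency check described above could look roughly like the sketch below. This is not the PR's actual code: the helper names `is_package_available` and `empty_init_context` are hypothetical, and the only accelerate call assumed is `init_empty_weights`.

```python
import importlib.util
from contextlib import nullcontext


def is_package_available(name: str) -> bool:
    """Return True if a top-level package can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def empty_init_context():
    """Hypothetical helper: pick the context manager used while building a model.

    Returns accelerate's init_empty_weights() when accelerate is installed,
    otherwise a no-op context so the model is built the default way.
    """
    if is_package_available("accelerate"):
        from accelerate import init_empty_weights

        return init_empty_weights()
    return nullcontext()
```

A caller would wrap model construction in `with empty_init_context():` and then load the pretrained weights as usual.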

Loads CLIP ViT-H-14 in about 4 seconds.

@rwightman
Collaborator

@ErwannMillon thanks for the PR. Most of the meta device code/logic is in torch itself, and I believe (correct me if I'm wrong) the lines of code needed to implement the rest would be fewer than the accelerate import guards, so I'd rather just do it natively if possible.
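As a rough illustration of what "natively" might mean here, the sketch below builds a model on the meta device and then fills it from a checkpoint using only torch. It assumes PyTorch 2.0 or later (for `torch.device` as a context manager); `build_model` is a hypothetical stand-in, not an open_clip function, and this is not necessarily how the PR implements it.

```python
import torch
import torch.nn as nn


def build_model() -> nn.Module:
    # Hypothetical stand-in for a large pretrained architecture.
    return nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 4))


# Stand-in for a checkpoint that would normally be loaded from disk.
reference = build_model()
state_dict = reference.state_dict()

# Construct on the meta device: parameters have shapes and dtypes, but no
# storage is allocated and the random initialization does no real work.
with torch.device("meta"):
    model = build_model()
assert all(p.is_meta for p in model.parameters())

# Allocate real (uninitialized) storage on the CPU, then fill it from the checkpoint.
model = model.to_empty(device="cpu")
model.load_state_dict(state_dict)
```

One caveat with this approach: anything not covered by the state dict, such as non-persistent buffers, is left uninitialized after `to_empty` and has to be recomputed separately.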

@ErwannMillon
Author

Sure, just removed the accelerate dep

@ErwannMillon
Author

Not sure what's failing. This was just a quick PR I made because I needed the feature for my work. I don't have time to dig into this right now, but it might be a good first issue for someone else.

@rwightman
Collaborator

@ErwannMillon k, it can definitely be useful, especially as the models get larger. Aside from the failing test, there are a few things I want to verify, and I could probably clean it up / compact it a bit. We can leave it as a draft for now?

There's also the possibility of doing it the way PyTorch intended, with torch.nn.utils.skip_init (https://pytorch.org/docs/stable/generated/torch.nn.utils.skip_init.html#torch.nn.utils.skip_init), but that requires modifying all models to accept device args and pass them through, which is a bit meh. Then again, the context manager approach is a bit of a glorious hack and could break with changes in PyTorch.
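To make the trade-off concrete, here is a minimal sketch of the `skip_init` route. `TinyMLP` is a hypothetical module written for illustration; the point is that every module has to accept a `device` argument and forward it to its submodules before `skip_init` can be used.

```python
import torch
import torch.nn as nn
from torch.nn.utils import skip_init


class TinyMLP(nn.Module):
    """Hypothetical module that threads device/dtype through to its layers."""

    def __init__(self, dim: int, hidden: int, device=None, dtype=None):
        super().__init__()
        factory_kwargs = {"device": device, "dtype": dtype}
        self.fc1 = nn.Linear(dim, hidden, **factory_kwargs)
        self.fc2 = nn.Linear(hidden, dim, **factory_kwargs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(torch.relu(self.fc1(x)))


# skip_init builds the module on the meta device, then allocates
# uninitialized storage on the target device (CPU by default).
mlp = skip_init(TinyMLP, 8, 16)

# The weights hold arbitrary values until they are loaded or initialized.
pretrained = TinyMLP(8, 16)  # stand-in for a real checkpoint
mlp.load_state_dict(pretrained.state_dict())
```

The cost is the one noted in the comment above: each model class needs the extra constructor arguments, whereas the context manager approach leaves model code untouched.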

@ErwannMillon
Author

Sure, no worries, thanks for taking the time to look at it.

@rwightman rwightman marked this pull request as draft April 20, 2023 15:30

2 participants