Add GGUF loader for FluxTransformer2DModel #9487
Perhaps after #9213. Note that exotic FPX schemes are already supported (FP6, FP5, FP4) with torchao. Check out this repo for that: https://github.com/sayakpaul/diffusers-torchao
Yes, I'm following that PR closely :)
Yeah, for sure. Thanks for following along!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Right up our alley. Cc: @DN6
@sayakpaul @DN6 if you want to take a look: a simple implementation of a generic GGUF loader that loads the weights, and from there it's simple to create the diffusers class.
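As a rough illustration of what such a generic loader starts with, here is a minimal sketch of parsing the fixed GGUF header (assuming the GGUF v2+ layout; `read_gguf_header` is a hypothetical helper for this thread, not a function from any library):

```python
import struct

def read_gguf_header(buf: bytes) -> dict:
    """Parse the fixed GGUF header: 4-byte magic b"GGUF", uint32 version,
    uint64 tensor count, uint64 metadata key/value count (all little-endian,
    per the GGUF spec, version >= 2). A full loader would go on to read the
    metadata KVs and tensor infos; this sketch only validates the prefix.
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors, "kv_count": n_kv}
```

A real loader (e.g. the `gguf` Python package used by llama.cpp tooling) continues past this header to the metadata and tensor data sections.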
Can you provide a simple demo of using a GGUF-format model in diffusers? I don't know exactly how to use a GGUF model.
I used the above code but got an error.
I get the same error. Did you solve it?
Set `dtype = torch.bfloat16` in this demo.
Hello, I use the GGUF Q5 model, but GPU memory usage is higher. Did your GPU memory usage actually go down?
Any progress here? |
I successfully loaded the weights in GGUF format, but only quant types with 0/1 suffixes (e.g. Q4_0, Q5_1) work; those with K/S suffixes do not. (diffusers-0.31.0-dev)
@zhaowendao30 thanks for this! Could you maybe modify your comment to include
Thanks! And which checkpoint did you use?
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main — only quant types with 0/1 suffixes work; those with K/S suffixes do not.
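The 0/1-suffix types are simple fixed-size block quants, which likely explains why they were easier to support than the K-quants (which use larger super-blocks with nested scales). For example, q4_0 stores 32 weights per 18-byte block: an fp16 scale plus 16 bytes of packed 4-bit values. A minimal numpy sketch of q4_0 dequantization, illustrative only and not the diffusers implementation:

```python
import numpy as np

def dequantize_q4_0(raw: bytes) -> np.ndarray:
    """Dequantize a buffer of ggml q4_0 blocks.

    Each 18-byte block holds 32 weights: an fp16 scale `d`, then 16 bytes of
    packed 4-bit quants (low nibbles = elements 0..15, high nibbles = 16..31).
    A quant q decodes to (q - 8) * d.
    """
    assert len(raw) % 18 == 0, "buffer must be whole 18-byte q4_0 blocks"
    out = []
    for off in range(0, len(raw), 18):
        d = np.frombuffer(raw[off:off + 2], dtype=np.float16)[0]
        qs = np.frombuffer(raw[off + 2:off + 18], dtype=np.uint8)
        lo = (qs & 0x0F).astype(np.int8) - 8   # elements 0..15
        hi = (qs >> 4).astype(np.int8) - 8     # elements 16..31
        out.append(np.concatenate([lo, hi]).astype(np.float32) * np.float32(d))
    return np.concatenate(out)
```

Since every weight in a block shares one scale, loading these formats is mostly byte-layout bookkeeping, whereas K-quants need extra per-sub-block scale handling.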
Just one word of caution: the code relies on
Being worked on in #9964.
Closing since #9964 was merged. Feel free to reopen if there are any issues. |
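For readers arriving after the merge, here is a minimal sketch of the usage the merged PR enables, assuming a diffusers version that ships `GGUFQuantizationConfig` (0.32+); the checkpoint path is whichever GGUF quant you choose, and the heavy imports are deferred inside the function so the sketch stays self-contained:

```python
def load_flux_gguf_transformer(ckpt_path, compute_dtype=None):
    """Sketch: load a GGUF-quantized Flux transformer with the post-#9964
    diffusers API. `ckpt_path` is a local .gguf file or a URL to one (e.g.
    a quant from city96/FLUX.1-dev-gguf). The weights stay quantized and are
    dequantized on the fly to `compute_dtype` during the forward pass.
    """
    import torch  # deferred: heavy dependencies
    from diffusers import FluxTransformer2DModel, GGUFQuantizationConfig

    if compute_dtype is None:
        compute_dtype = torch.bfloat16  # matches the dtype advice above

    return FluxTransformer2DModel.from_single_file(
        ckpt_path,
        quantization_config=GGUFQuantizationConfig(compute_dtype=compute_dtype),
        torch_dtype=compute_dtype,
    )
```

The returned transformer can then be passed to `FluxPipeline.from_pretrained(..., transformer=...)` in the usual way.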
GGUF is becoming a preferred means of distributing FLUX fine-tunes.

Transformers recently added general support for GGUF and is slowly adding support for additional model types (the implementation adds a `gguf_file` param to the `from_pretrained` method). This PR adds support for loading GGUF files to `T5EncoderModel`. I've tested the code with the quants available at https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf/tree/main and it's working with the current Flux implementation in diffusers.
However, as `FluxTransformer2DModel` is defined in the diffusers library, support has to be added here to be able to load the actual transformer model, which is most (if not all) of a Flux fine-tune. Examples that can be used:
- with weights quantized as q4_0, q4_1, q5_0, q5_1
- with weights simply converted from f16
cc: @yiyixuxu @sayakpaul @DN6