Add VAEBackbone and use it for SD3 #1892
Conversation
Thanks @james77777778, left a few comments.
    return input_shape

class ResNetBlock(keras.layers.Layer):
Can the ResNet implementation we already have be reused here?
The normalization layer and activation function are different.
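For context, a minimal numpy sketch of the normalization/activation pair that diffusion-style VAE blocks typically use (GroupNorm + swish/SiLU), as opposed to the BatchNorm + ReLU of the existing ResNet implementation. The function names and shapes here are illustrative, not the keras_hub implementation:

```python
import numpy as np

def silu(x):
    # "swish" activation commonly used in diffusion VAE blocks
    return x / (1.0 + np.exp(-x))

def group_norm(x, groups=32, eps=1e-6):
    # x: (batch, height, width, channels); normalize within channel groups,
    # unlike BatchNorm which normalizes across the batch dimension
    b, h, w, c = x.shape
    g = min(groups, c)
    xg = x.reshape(b, h, w, g, c // g)
    mean = xg.mean(axis=(1, 2, 4), keepdims=True)
    var = xg.var(axis=(1, 2, 4), keepdims=True)
    return ((xg - mean) / np.sqrt(var + eps)).reshape(b, h, w, c)

x = np.random.default_rng(0).normal(size=(1, 8, 8, 64))
y = silu(group_norm(x, groups=32))
print(y.shape)  # (1, 8, 8, 64)
```

Because both the normalization and the activation differ, sharing code with the classification ResNet would require parameterizing those two choices rather than reusing the block directly.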
    return output_shape

class AutoEncoder(Backbone):
You should add this in our model folder:
- Create a model folder named `autoencoder`.
- Add `auto_encoder_layers.py` and put all the layers and their tests there.
- Add `autoencoder_backbone.py` and add this backbone.
- Add basic backbone tests.

It is fine not to expose it and only use it in Stable Diffusion.
Got it. This is better! I have added a folder called `vae`, and I believe it can also be used for FLUX models.
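As a toy numpy sketch (not the keras_hub API) of the interface such a VAE backbone exposes: the encoder maps an image to a latent distribution at 8x spatial downsampling, and sampling uses the reparameterization trick. The function names `encode`/`sample` and the 16 latent channels are illustrative assumptions here:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(image, latent_channels=16):
    # Stand-in for the conv encoder: produce a mean/log-variance pair
    # per latent position, at 8x spatial downsampling.
    b, h, w, _ = image.shape
    mean = rng.normal(size=(b, h // 8, w // 8, latent_channels))
    logvar = np.zeros_like(mean)
    return mean, logvar

def sample(mean, logvar):
    # Reparameterization trick: z = mean + std * eps
    return mean + np.exp(0.5 * logvar) * rng.normal(size=mean.shape)

mean, logvar = encode(np.zeros((1, 64, 64, 3)))
z = sample(mean, logvar)
print(z.shape)  # (1, 8, 8, 16)
```

Since this encode-to-latents interface is shared by SD3 and FLUX-style models, keeping the VAE as its own backbone rather than burying it inside the SD3 model makes reuse straightforward.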
Force-pushed from e4dd23c to 132b886.
@divyashreepathihalli Colab for demo: https://colab.research.google.com/drive/1ucx2lEck1ZO3HVTjZ4qgVsQrBWrHWfVQ I can add image2image and inpaint after this PR is merged.
keras_hub/src/models/stable_diffusion_3/flow_match_euler_discrete_scheduler.py (resolved)
Force-pushed from 132b886 to 22db582.
LGTM
keras_hub/src/models/stable_diffusion_3/stable_diffusion_3_backbone_test.py (resolved)
Looks good! Need some clarification on the changes to `task.py`. Otherwise LGTM!
    info = "Audio shape: "
    info += highlight_shape(audio_converter.audio_shape())
    add_layer(audio_converter, info)
if preprocessor and isinstance(preprocessor, keras.Layer):
Can you please explain why you have made these changes?
This is needed for a nested preprocessor. I have added a comment for this change:
# Since the preprocessor might be nested with multiple `Tokenizer`,
# `ImageConverter`, `AudioConverter` and even other `Preprocessor`
# instances, we should recursively iterate through them.
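The recursion described in that comment can be sketched in plain Python. The class names stand in for the keras_hub `Tokenizer`/`ImageConverter`/`Preprocessor` classes, and `iter_converters` is a hypothetical helper, not the actual `task.py` code:

```python
# Toy stand-ins for the keras_hub preprocessing classes.
class Tokenizer: ...
class ImageConverter: ...

class Preprocessor:
    def __init__(self, *layers):
        self.layers = list(layers)

def iter_converters(layer):
    """Yield every tokenizer/converter, however deeply nested."""
    if isinstance(layer, (Tokenizer, ImageConverter)):
        yield layer
    if isinstance(layer, Preprocessor):
        # A preprocessor may itself contain other preprocessors,
        # so recurse instead of reading a single flat attribute.
        for child in layer.layers:
            yield from iter_converters(child)

nested = Preprocessor(Tokenizer(), Preprocessor(ImageConverter()))
print(len(list(iter_converters(nested))))  # 2
```

A flat attribute lookup would miss the `ImageConverter` inside the inner `Preprocessor` here, which is why the summary code needs to walk the tree recursively.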
This PR adds the VAE encoder to SD3 and also fixes the `Task` summary to recursively parse the info rather than relying on the layer attribute. The preset on Kaggle has also been updated.
Demo colab:
https://colab.research.google.com/drive/1ucx2lEck1ZO3HVTjZ4qgVsQrBWrHWfVQ?usp=sharing