Add VAEBackbone and use it for SD3 #1892

Merged


james77777778
Collaborator

This PR adds the VAE encoder to SD3. It also fixes the `Task` summary so that it parses the info recursively rather than relying on the layer attribute.

The preset on Kaggle has also been updated.

Demo colab:
https://colab.research.google.com/drive/1ucx2lEck1ZO3HVTjZ4qgVsQrBWrHWfVQ?usp=sharing

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Sep 30, 2024
Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

Thanks @james77777778, left a few comments.

keras_hub/src/models/stable_diffusion_3/autoencoder.py (outdated)
return input_shape


class ResNetBlock(keras.layers.Layer):
Collaborator

Can the ResNet implementation we already have be reused here?

Collaborator Author

The normalization layer and activation function are different.
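
To make the difference concrete, here is a rough, stdlib-only sketch. It assumes (the thread does not say so) that the VAE block pairs group normalization with a swish/SiLU activation, as the reference Stable Diffusion autoencoder does, where a classic ResNet block pairs batch normalization with ReLU. The helper names are hypothetical and are not the KerasHub layers.

```python
import math


def relu(x):
    """Activation of a classic ResNet block: negative inputs become zero."""
    return max(0.0, x)


def silu(x):
    """Swish/SiLU, x * sigmoid(x). Assumed here to be the VAE block's activation."""
    return x / (1.0 + math.exp(-x))


def group_norm(channels, groups, eps=1e-6):
    """Normalize one sample's channel values within each group of channels.

    Batch normalization would compute statistics per channel across the
    batch instead. Group normalization needs no batch statistics at all.
    """
    size = len(channels) // groups
    out = []
    for g in range(groups):
        chunk = channels[g * size:(g + 1) * size]
        mean = sum(chunk) / size
        var = sum((c - mean) ** 2 for c in chunk) / size
        out.extend((c - mean) / math.sqrt(var + eps) for c in chunk)
    return out


# Four channel values split into two groups of two.
normed = group_norm([1.0, 3.0, 10.0, 30.0], groups=2)
activated = [silu(v) for v in normed]
```

Under that assumption, reusing the existing ResNet block would require making both the normalization layer and the activation configurable, which is the trade-off the reply points at.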

keras_hub/src/models/stable_diffusion_3/autoencoder.py (outdated)
return output_shape


class AutoEncoder(Backbone):
Collaborator

You should add this in our model folder:
- Create a model folder named autoencoder.
- Add auto_encoder_layers.py and put all the layers and tests there.
- Add autoencoder_backbone.py and add this backbone to it.
- Add basic backbone tests.

It is fine not to expose it and to use it only in Stable Diffusion.

Collaborator Author

Got it. This is better!
I have added a folder called vae, and I believe it can also be used for FLUX models.

@sampathweb sampathweb added kokoro:force-run Runs Tests on GPU and removed kokoro:force-run Runs Tests on GPU labels Oct 1, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 1, 2024
@james77777778 james77777778 changed the title Add autoencoder to SD3 Add VAEBackbone and use it for SD3 Oct 1, 2024
@james77777778 james77777778 added the kokoro:force-run Runs Tests on GPU label Oct 1, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 1, 2024
Collaborator Author

james77777778 commented Oct 1, 2024

@divyashreepathihalli
This PR is ready for review. Please let me know if any further changes are required.

Colab for demo: https://colab.research.google.com/drive/1ucx2lEck1ZO3HVTjZ4qgVsQrBWrHWfVQ

I can add image2image and inpaint after this PR is merged.
The finetuning part is much more complicated. I don't think I will be able to complete it before the official release of KerasHub.

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Oct 1, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 1, 2024
Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

LGTM

@james77777778 james77777778 added the kokoro:force-run Runs Tests on GPU label Oct 2, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 2, 2024
Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

Looks good! I need some clarification on the changes to task.py. Otherwise, LGTM!

info = "Audio shape: "
info += highlight_shape(audio_converter.audio_shape())
add_layer(audio_converter, info)
if preprocessor and isinstance(preprocessor, keras.Layer):
Collaborator

Can you please explain why you have made these changes?

Collaborator Author

This is needed for a nested preprocessor. I have added a comment for this change:

            # Since the preprocessor might be nested with multiple `Tokenizer`,
            # `ImageConverter`, `AudioConverter` and even other `Preprocessor`
            # instances, we should recursively iterate through them.
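
The traversal that comment describes can be sketched roughly as follows. This is not the code in task.py: the classes are hypothetical stand-ins for `keras.Layer`, `Tokenizer`, `ImageConverter`, `AudioConverter` and `Preprocessor`, and `collect_summary_layers` and the layer names are made up for illustration.

```python
class Layer:
    """Hypothetical stand-in for `keras.Layer`."""

    def __init__(self, name):
        self.name = name


class Tokenizer(Layer):
    pass


class ImageConverter(Layer):
    pass


class AudioConverter(Layer):
    pass


class Preprocessor(Layer):
    """Holds child layers as attributes; children may be preprocessors too."""

    def __init__(self, name, **children):
        super().__init__(name)
        for attr, child in children.items():
            setattr(self, attr, child)


def collect_summary_layers(preprocessor, found=None):
    """Walk the attributes of `preprocessor`, descending into nested ones."""
    if found is None:
        found = []
    for value in vars(preprocessor).values():
        if isinstance(value, Preprocessor):
            # Nested preprocessor: recurse rather than stopping at this level.
            collect_summary_layers(value, found)
        elif isinstance(value, (Tokenizer, ImageConverter, AudioConverter)):
            found.append(value)
    return found


# A text preprocessor that wraps two other preprocessors, each with a tokenizer.
text = Preprocessor(
    "text",
    clip_l=Preprocessor("clip_l", tokenizer=Tokenizer("clip_l_tokenizer")),
    t5=Preprocessor("t5", tokenizer=Tokenizer("t5_tokenizer")),
)
nested = Preprocessor(
    "sd3", text=text, image_converter=ImageConverter("image_converter")
)
names = [layer.name for layer in collect_summary_layers(nested)]
```

A lookup that only checks the top-level attributes of `nested` would report the image converter and miss both tokenizers, which is the case the recursive version covers.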

keras_hub/src/models/vae/vae_layers.py (outdated)
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 3, 2024
@divyashreepathihalli divyashreepathihalli merged commit 11227f3 into keras-team:master Oct 3, 2024
10 checks passed