
Fix custom ops loading in diffusers #1655

Merged: 3 commits into huggingface:main on Jan 17, 2025

Conversation

@dsocek (Contributor) commented on Dec 20, 2024

What does this PR do?

This PR contains a critical fix for custom ops loading in diffusers.

More information

As discussed in PR #1631, removing the htcore import before model loading would break quantization support for OH diffusers; however, @skaulintel encountered issues with some workloads when htcore is imported before the model is loaded. The underlying issue is due to how GaudiConfig is handled when custom ops precision lists are defined in the configuration (e.g. in Habana/stable-diffusion-2).
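
To make the ordering problem concrete, here is a minimal sketch of the failing sequence (hypothetical glue code, not the actual pipeline internals; the `declare_autocast_bf16_fp32_ops` name is an assumption to verify against the optimum.habana sources):

    # Problematic ordering: htcore is imported first (as quantization requires),
    # and only afterwards does the pipeline try to apply the custom autocast op
    # lists defined in the Gaudi config.
    import habana_frameworks.torch.core as htcore

    from optimum.habana import GaudiConfig

    gaudi_config = GaudiConfig.from_pretrained("Habana/stable-diffusion-2")

    # Applying the config's bf16/fp32 op lists at this point fails, because
    # optimum.habana refuses to set Torch Autocast ops once
    # habana_frameworks.torch.core has already been imported.
    gaudi_config.declare_autocast_bf16_fp32_ops()  # raises RuntimeError (method name assumed)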

This PR fixes the underlying issue.

@dsocek requested a review from regisss as a code owner on December 20, 2024 20:51
@splotnikv (Contributor) left a comment

LGTM

@regisss (Collaborator) commented on Dec 23, 2024

@dsocek Is this linked to #1657 ?

@dsocek (Contributor, Author) commented on Dec 23, 2024

@dsocek Is this linked to #1657 ?

@regisss Yes, here's how:

#1657 is a temporary, less intrusive, and partial fix that can be implemented with minimal validation and testing, ensuring it doesn’t delay the release.

This PR, on the other hand, is the complete fix. However, @libinta recommended postponing it until after the release, as it requires extensive validation.

Signed-off-by: Daniel Socek <daniel.socek@intel.com>
@dsocek force-pushed the fix-diffusers-custom-ops-loading branch from 3782271 to e80b854 on January 3, 2025 23:29
@imangohari1 (Contributor) left a comment

Daniel and I are working on this PR to reproduce the issue with the released 1.19.0-561 driver/container.

@dsocek (Contributor, Author) commented on Jan 9, 2025

Daniel and I are working on this PR to reproduce the issue with the released 1.19.0-561 driver/container.

The issue can be reproduced if quantization is used with autocasting (i.e. without forcing BF16 via --bf16) together with a config that defines custom autocast op lists (for example, Habana/stable-diffusion-2).

For example, you will see the issue with:

QUANT_CONFIG=quantization/flux/measure_config.json python text_to_image_generation.py \
     --model_name_or_path black-forest-labs/FLUX.1-dev \
     --prompts "A cat holding a sign that says hello world" \
     --num_images_per_prompt 10 \
     --batch_size 1 \
     --num_inference_steps 30 \
     --image_save_dir /tmp/flux_1_images \
     --scheduler flow_match_euler_discrete \
     --use_habana \
     --use_hpu_graphs \
     --gaudi_config Habana/stable-diffusion-2 \
     --sdp_on_bf16 \
     --quant_mode measure

Basically, you will get this error:

RuntimeError: Setting bf16/fp32 ops for Torch Autocast but `habana_frameworks.torch.core` has already been imported. You should instantiate your Gaudi config and your training arguments before importing from `habana_frameworks.torch` or calling a method from `optimum.habana.utils`.

We resolved the issue by ensuring that, if custom autocast operations are defined in the configuration, the runtime variable is set before importing htcore and loading the model. For quantization, htcore must be imported before the model is loaded. The previous approach, where htcore was not imported before the model and the custom ops runtime variables were set after the model was loaded, is incompatible with quantization support. This PR addresses the problem and ensures full compatibility.
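
A minimal sketch of the ordering the fix enforces (again hypothetical glue code rather than the actual diff; the attribute and method names are assumptions to verify against optimum.habana's GaudiConfig):

    from optimum.habana import GaudiConfig

    gaudi_config = GaudiConfig.from_pretrained("Habana/stable-diffusion-2")

    # If the config defines custom autocast op lists, register them with Torch
    # Autocast BEFORE habana_frameworks.torch.core is imported.
    if getattr(gaudi_config, "autocast_bf16_ops", None) or getattr(gaudi_config, "autocast_fp32_ops", None):
        gaudi_config.declare_autocast_bf16_fp32_ops()

    # Quantization requires htcore to be imported before the model is loaded,
    # so the import happens only once the op lists are in place.
    import habana_frameworks.torch.core as htcore  # noqa: E402

    # ... load the diffusers pipeline / model after this point ...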

@imangohari1 my reply to your question is the same as before: (line 169 is now handled by lines 371-373)

Signed-off-by: Daniel Socek <daniel.socek@intel.com>
@imangohari1 (Contributor) commented:

I submitted an internal CI job for this PR: CI#444.
It is progressing now. Will check the results next week.

@imangohari1 (Contributor) commented:


I can reproduce this issue on OH main, and it does NOT happen with these fixes 👍 Thanks.

@imangohari1 (Contributor) left a comment

I have run this PR's changes manually and they look fine.
I have also run this PR through internal CI #444 and the results looked good.
It should be good.

@regisss, over to you for the final review. Thanks.

@libinta added the run-test (Run CI for PRs from external contributors) label on Jan 16, 2025
@regisss merged commit d01f8bb into huggingface:main on Jan 17, 2025. 4 checks passed.
Liangyx2 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jan 20, 2025
Signed-off-by: Daniel Socek <daniel.socek@intel.com>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>