
Fix custom ops loading in diffusers #1655

Merged: 3 commits into huggingface:main on Jan 17, 2025

Conversation

@dsocek (Contributor) commented on Dec 20, 2024

What does this PR do?

This PR contains a critical fix for custom ops loading in diffusers.

More information

As discussed in PR #1631, removing the htcore import before model loading would break quantization support for OH diffusers; however, @skaulintel encountered issues with some workloads when htcore is imported before the model is loaded. The underlying issue is due to how GaudiConfig is handled when custom ops precision lists are defined in the configuration (e.g. in Habana/stable-diffusion-2).
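
To make the ordering problem concrete, here is a minimal sketch of the failing sequence (hypothetical glue code, not the actual pipeline internals; the `declare_autocast_bf16_fp32_ops` name is an assumption to verify against the optimum.habana sources):

    # Problematic ordering: htcore is imported first (as quantization requires),
    # and only afterwards does the pipeline try to apply the custom autocast op
    # lists defined in the Gaudi config.
    import habana_frameworks.torch.core as htcore

    from optimum.habana import GaudiConfig

    gaudi_config = GaudiConfig.from_pretrained("Habana/stable-diffusion-2")

    # Applying the config's bf16/fp32 op lists at this point fails, because
    # optimum.habana refuses to set Torch Autocast ops once
    # habana_frameworks.torch.core has already been imported.
    gaudi_config.declare_autocast_bf16_fp32_ops()  # raises RuntimeError (method name assumed)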

This PR fixes the underlying issue.

@dsocek requested a review from regisss as a code owner on December 20, 2024 20:51
@splotnikv (Contributor) left a comment

LGTM

@regisss (Collaborator) commented on Dec 23, 2024

@dsocek Is this linked to #1657 ?

@dsocek (Contributor, Author) commented on Dec 23, 2024

@dsocek Is this linked to #1657 ?

@regisss Yes, here's how:

#1657 is a temporary, less intrusive, and partial fix that can be implemented with minimal validation and testing, ensuring it doesn’t delay the release.

This PR, on the other hand, is the complete fix. However, @libinta recommended postponing it until after the release, as it requires extensive validation.

Signed-off-by: Daniel Socek <daniel.socek@intel.com>
@dsocek force-pushed the fix-diffusers-custom-ops-loading branch from 3782271 to e80b854 on January 3, 2025 23:29
@imangohari1 (Contributor) left a comment

Daniel and I are working on this PR to reproduce the issue with the released 1.19.0-561 driver/container.

@dsocek (Contributor, Author) commented on Jan 9, 2025

Daniel and I are working on this PR to reproduce the issue with the released 1.19.0-561 driver/container.

The issue can be reproduced if quantization is used with autocasting (i.e. without forcing BF16 via --bf16) together with a config that defines custom autocast op lists (for example, Habana/stable-diffusion-2).

For example, you will see the issue with:

QUANT_CONFIG=quantization/flux/measure_config.json python text_to_image_generation.py \
     --model_name_or_path black-forest-labs/FLUX.1-dev \
     --prompts "A cat holding a sign that says hello world" \
     --num_images_per_prompt 10 \
     --batch_size 1 \
     --num_inference_steps 30 \
     --image_save_dir /tmp/flux_1_images \
     --scheduler flow_match_euler_discrete \
     --use_habana \
     --use_hpu_graphs \
     --gaudi_config Habana/stable-diffusion-2 \
     --sdp_on_bf16 \
     --quant_mode measure

Basically, you will get this error:

RuntimeError: Setting bf16/fp32 ops for Torch Autocast but `habana_frameworks.torch.core` has already been imported. You should instantiate your Gaudi config and your training arguments before importing from `habana_frameworks.torch` or calling a method from `optimum.habana.utils`.

We resolved the issue by ensuring that, if custom autocast operations are defined in the configuration, the runtime variable is set before importing htcore and loading the model. For quantization, htcore must be imported before the model is loaded. The previous approach, where htcore was not imported before the model and the custom ops runtime variables were set after the model was loaded, is incompatible with quantization support. This PR addresses the problem and ensures full compatibility.
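
A minimal sketch of the ordering the fix enforces (again hypothetical glue code rather than the actual diff; the attribute and method names are assumptions to verify against optimum.habana's GaudiConfig):

    from optimum.habana import GaudiConfig

    gaudi_config = GaudiConfig.from_pretrained("Habana/stable-diffusion-2")

    # If the config defines custom autocast op lists, register them with Torch
    # Autocast BEFORE habana_frameworks.torch.core is imported.
    if getattr(gaudi_config, "autocast_bf16_ops", None) or getattr(gaudi_config, "autocast_fp32_ops", None):
        gaudi_config.declare_autocast_bf16_fp32_ops()

    # Quantization requires htcore to be imported before the model is loaded,
    # so the import happens only once the op lists are in place.
    import habana_frameworks.torch.core as htcore  # noqa: E402

    # ... load the diffusers pipeline / model after this point ...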

@imangohari1 my reply to your question is the same as before: (line 169 is now handled by lines 371-373)

Signed-off-by: Daniel Socek <daniel.socek@intel.com>
@imangohari1 (Contributor) commented:

I submitted an internal CI job for this PR: CI#444.
It is progressing now. Will check the results next week.

@imangohari1 (Contributor) commented:


I can reproduce this issue on OH main, and it does NOT happen with these fixes 👍 Thanks.

@imangohari1 (Contributor) left a comment

I have run this PR's changes manually and they look fine.
I have also run this PR through internal CI #444 and the results looked good.
It should be good.

@regisss, over to you for the final review. Thanks.

@libinta added the run-test (Run CI for PRs from external contributors) label on Jan 16, 2025
@regisss merged commit d01f8bb into huggingface:main on Jan 17, 2025. 4 checks passed.
Liangyx2 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jan 20, 2025
Signed-off-by: Daniel Socek <daniel.socek@intel.com>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>