Align diffusers CI tests with examples #1679

dsocek · 2025-01-06T20:00:02Z

What does this PR do?

Comprehensive fixes and alignment between CI and examples README.md for diffusers

The first commit reorganizes the README sections for better clarity and structure, and removes a number of over-represented samples (e.g. for ControlNet and inpainting). Text-to-video is now integrated here rather than kept in a standalone examples folder. Please see the improved README:
https://github.com/dsocek/optimum-habana/tree/align-diffusers-ci-tests-with-examples/examples/stable-diffusion
The second commit modifies tests/test_diffusers.py and closes the gap between the README and CI. A number of over-represented tests were removed, and all slow tests were aligned with the parameters used in the README. Some tests were also merged into a single test for efficiency (e.g. separate functionality and performance tests are now merged into one test that runs the pipeline once and then checks both).
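The merged-test idea can be sketched as follows. This is only an illustration, not the code in tests/test_diffusers.py: `fake_pipeline`, `MergedPipelineTest`, and the `MIN_THROUGHPUT` baseline are hypothetical names chosen for the example, and the real tests call Gaudi pipelines and compare against measured baselines.

```python
import time
import unittest


def fake_pipeline(prompt, num_images=2):
    """Hypothetical stand-in for a diffusers pipeline call."""
    time.sleep(0.01)  # pretend to do some work
    return [f"{prompt}-{i}" for i in range(num_images)]


class MergedPipelineTest(unittest.TestCase):
    # Hypothetical baseline, in images per second.
    MIN_THROUGHPUT = 1.0

    def test_generation_and_throughput(self):
        # Run the pipeline exactly once and time that single run.
        start = time.perf_counter()
        images = fake_pipeline("astronaut", num_images=2)
        elapsed = time.perf_counter() - start

        # Functional check on the output of that run.
        self.assertEqual(len(images), 2)
        # Performance check reuses the timing of the same run.
        self.assertGreaterEqual(len(images) / elapsed, self.MIN_THROUGHPUT)
```

Running the pipeline once and asserting on both the output and the timing avoids paying the generation cost twice, which is where the efficiency gain comes from.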

README.md

examples/stable-diffusion/README.md

docs/source/index.mdx

yafshar · 2025-01-09T22:05:28Z

@dsocek thanks for this PR. Great work. Have you run the CI for this PR? Please post the CI results and run the slow tests with your modifications. I am getting some test failures.

yafshar · 2025-01-09T22:41:37Z

The custom op issue has already been addressed in pull request #1655.

dsocek · 2025-01-10T00:27:22Z

@yafshar all your changes are incorporated (I squashed many of these into 1 commit), and I also added a commit which fixes the quant issue that you found. Thanks for the great review!

yafshar · 2025-01-10T17:10:45Z

@dsocek thanks! Would you please post the test results? I am still seeing some issues.

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_diffusers.py -s -v

optimum/habana/diffusers/pipelines/pipeline_utils.py

yafshar · 2025-01-12T13:45:03Z

@dsocek thanks! Would you please post the test results? Everything else sounds great to me. We can ask regisss to check the PR.

Signed-off-by: Daniel Socek <daniel.socek@intel.com>

Co-authored-by: Yaser Afshar <yaser.afshar@intel.com>

dsocek · 2025-01-17T00:30:45Z

@yafshar It seems there were some issues on 1.19 (I originally tested only on 1.18), related to SDP in bf16. After PT 2.5, SDP runs in full precision as opposed to bf16. I added proper enablement for all diffusers pipelines and updated the tests to use bf16 SDP. Now all diffusers tests (slow and fast) pass on both 1.18 and 1.19.

export RUN_SLOW=true && \
export GAUDI2_CI=1 && \
make test_installs && \
python -m pytest tests/test_diffusers.py -v

Output:
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_default PASSED                                                                                                                    [  0%]
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_device PASSED                                                                                                                     [  1%]
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_gaudi_config_raise_error_without_habana PASSED                                                                                    [  1%]
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_gaudi_config_types PASSED                                                                                                         [  2%]
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_save_pretrained PASSED                                                                                                            [  3%]
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_use_hpu_graphs PASSED                                                                                                             [  3%]
tests/test_diffusers.py::GaudiPipelineUtilsTester::test_use_hpu_graphs_raise_error_without_habana PASSED                                                                                  [  4%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_no_generation_regression_ldm3d SKIPPED (test requires custom bf16 ops)                                                  [  5%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_no_generation_regression_upscale PASSED                                                                                 [  5%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_no_throughput_regression_autocast SKIPPED (test requires custom bf16 ops)                                               [  6%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_no_throughput_regression_bf16 PASSED                                                                                    [  7%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_batch_sizes PASSED                                                                                     [  7%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_bf16 PASSED                                                                                            [  8%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_ddim PASSED                                                                                            [  9%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_default PASSED                                                                                         [  9%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_hpu_graphs PASSED                                                                                      [ 10%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_no_safety_checker PASSED                                                                               [ 11%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_num_images_per_prompt PASSED                                                                           [ 11%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_output_types_0_pil PASSED                                                                              [ 12%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_output_types_1_np PASSED                                                                               [ 13%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_stable_diffusion_output_types_2_latent PASSED                                                                           [ 13%]
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_sd_textual_inversion SKIPPED (system does not have 8 cards)                                                             [ 14%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_batch_sizes PASSED                                                                                [ 15%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_bf16 PASSED                                                                                       [ 15%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_default PASSED                                                                                    [ 16%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_euler PASSED                                                                                      [ 16%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_euler_ancestral PASSED                                                                            [ 17%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_generation_throughput PASSED                                                                      [ 18%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_hpu_graphs PASSED                                                                                 [ 18%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_num_images_per_prompt PASSED                                                                      [ 19%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_output_types_0_pil PASSED                                                                         [ 20%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_output_types_1_np PASSED                                                                          [ 20%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_output_types_2_latent PASSED                                                                      [ 21%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_stable_diffusion_xl_turbo_euler_ancestral PASSED                                                                      [ 22%]
tests/test_diffusers.py::GaudiStableDiffusionXLPipelineTester::test_sdxl_textual_inversion PASSED                                                                                         [ 22%]
tests/test_diffusers.py::GaudiStableDiffusion3PipelineTester::test_fused_qkv_projections PASSED                                                                                           [ 23%]
tests/test_diffusers.py::GaudiStableDiffusion3PipelineTester::test_stable_diffusion_3_different_negative_prompts PASSED                                                                   [ 24%]
tests/test_diffusers.py::GaudiStableDiffusion3PipelineTester::test_stable_diffusion_3_different_prompts PASSED                                                                            [ 24%]
tests/test_diffusers.py::GaudiStableDiffusion3PipelineTester::test_stable_diffusion_3_prompt_embeds PASSED                                                                                [ 25%]
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_batch_sizes PASSED                                                                [ 26%]
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_bf16 PASSED                                                                       [ 26%]
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_default PASSED                                                                    [ 27%]
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_hpu_graphs PASSED                                                                 [ 28%]
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_num_images_per_prompt PASSED                                                      [ 28%]
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_batch_sizes PASSED                                                      [ 29%]
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_bf16 PASSED                                                             [ 30%]
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_default PASSED                                                          [ 30%]
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_hpu_graphs PASSED                                                       [ 31%]
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_num_images_per_prompt PASSED                                            [ 32%]
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline PASSED                                                                                      [ 32%]
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_batch PASSED                                                                                [ 33%]
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_bf16 PASSED                                                                                 [ 33%]
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_default PASSED                                                                              [ 34%]
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_hpu_graphs PASSED                                                                           [ 35%]
tests/test_diffusers.py::TrainTextToImage::test_train_text_to_image_script PASSED                                                                                                         [ 35%]
tests/test_diffusers.py::TrainTextToImage::test_train_text_to_image_sdxl PASSED                                                                                                           [ 36%]
tests/test_diffusers.py::TrainControlNet::test_train_controlnet SKIPPED (system does not have 8 cards)                                                                                    [ 37%]
tests/test_diffusers.py::TrainControlNet::test_script_train_controlnet PASSED                                                                                                             [ 37%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_full PASSED                                                                                                                          [ 38%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_full_with_text_encoder PASSED                                                                                                        [ 39%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_loha PASSED                                                                                                                          [ 39%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_loha_with_text_encoder PASSED                                                                                                        [ 40%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_lokr PASSED                                                                                                                          [ 41%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_lokr_with_text_encoder PASSED                                                                                                        [ 41%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_lora PASSED                                                                                                                          [ 42%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_lora_with_text_encoder PASSED                                                                                                        [ 43%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_oft PASSED                                                                                                                           [ 43%]
tests/test_diffusers.py::DreamBooth::test_dreambooth_oft_with_text_encoder PASSED                                                                                                         [ 44%]
tests/test_diffusers.py::DreamBoothLoRASDXL::test_dreambooth_lora_sdxl PASSED                                                                                                             [ 45%]
tests/test_diffusers.py::DreamBoothLoRASDXL::test_dreambooth_lora_sdxl_with_text_encoder PASSED                                                                                           [ 45%]
tests/test_diffusers.py::GaudiStableVideoDiffusionPipelineTester::test_stable_video_diffusion_no_throughput_regression_bf16 PASSED                                                        [ 46%]
tests/test_diffusers.py::GaudiStableVideoDiffusionPipelineTester::test_stable_video_diffusion_single_video PASSED                                                                         [ 47%]
tests/test_diffusers.py::GaudiStableVideoDiffusionControlNetPipelineTester::test_stable_video_diffusion_single_video PASSED                                                               [ 47%]
tests/test_diffusers.py::GaudiStableDiffusionInstructPix2PixPipelineTests::test_stable_diffusion_pix2pix_default_case PASSED                                                              [ 48%]
tests/test_diffusers.py::GaudiStableDiffusionInstructPix2PixPipelineTests::test_stable_diffusion_pix2pix_euler PASSED                                                                     [ 49%]
tests/test_diffusers.py::GaudiStableDiffusionInstructPix2PixPipelineTests::test_stable_diffusion_pix2pix_multiple_init_images PASSED                                                      [ 49%]
tests/test_diffusers.py::GaudiStableDiffusionInstructPix2PixPipelineTests::test_stable_diffusion_pix2pix_negative_prompt PASSED                                                           [ 50%]
tests/test_diffusers.py::GaudiStableDiffusionImg2ImgPipelineTests::test_stable_diffusion_img2img_default_case PASSED                                                                      [ 50%]
tests/test_diffusers.py::GaudiStableDiffusionImg2ImgPipelineTests::test_stable_diffusion_img2img_multiple_init_images PASSED                                                              [ 51%]
tests/test_diffusers.py::GaudiStableDiffusionImg2ImgPipelineTests::test_stable_diffusion_img2img_negative_prompt PASSED                                                                   [ 52%]
tests/test_diffusers.py::GaudiStableDiffusionImageVariationPipelineTests::test_stable_diffusion_img_variation_default_case PASSED                                                         [ 52%]
tests/test_diffusers.py::GaudiStableDiffusionImageVariationPipelineTests::test_stable_diffusion_img_variation_multiple_images PASSED                                                      [ 53%]
tests/test_diffusers.py::GaudiStableDiffusionXLImg2ImgPipelineTests::test_components_function PASSED                                                                                      [ 54%]
tests/test_diffusers.py::GaudiStableDiffusionXLImg2ImgPipelineTests::test_stable_diffusion_xl_img2img_euler PASSED                                                                        [ 54%]
tests/test_diffusers.py::GaudiTextToVideoSDPipelineTester::test_stable_video_diffusion_no_latency_regression_bf16 PASSED                                                                  [ 55%]
tests/test_diffusers.py::GaudiTextToVideoSDPipelineTester::test_text_to_video_default_case PASSED                                                                                         [ 56%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_attention_slicing_forward_pass PASSED                                                                                  [ 56%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_callback_cfg PASSED                                                                                                    [ 57%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_callback_inputs PASSED                                                                                                 [ 58%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_cfg PASSED                                                                                                             [ 58%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_components_function PASSED                                                                                             [ 59%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_dict_tuple_outputs_equivalent PASSED                                                                                   [ 60%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_inference_batch_consistent PASSED                                                                                      [ 61%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_inference_batch_single_identical PASSED                                                                                [ 62%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_karras_schedulers_shape PASSED                                                                                         [ 62%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_latents_input PASSED                                                                                                   [ 63%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_num_images_per_prompt PASSED                                                                                           [ 64%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_pipeline_call_signature PASSED                                                                                         [ 65%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_progress_bar PASSED                                                                                                    [ 66%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_pt_np_pil_inputs_equivalent PASSED                                                                                     [ 66%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_pt_np_pil_outputs_equivalent PASSED                                                                                    [ 67%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_stable_diffusion_inpaint PASSED                                                                                        [ 69%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_stable_diffusion_inpaint_no_throughput_regression PASSED                                                               [ 69%]
tests/test_diffusers.py::StableDiffusionInpaintPipelineTests::test_to_dtype PASSED                                                                                                        [ 71%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_attention_slicing_forward_pass PASSED                                                                                [ 72%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_callback_cfg PASSED                                                                                                  [ 73%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_callback_inputs PASSED                                                                                               [ 73%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_cfg PASSED                                                                                                           [ 74%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_components_function PASSED                                                                                           [ 75%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_dict_tuple_outputs_equivalent PASSED                                                                                 [ 75%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_inference_batch_consistent PASSED                                                                                    [ 77%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_inference_batch_single_identical PASSED                                                                              [ 77%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_latents_input PASSED                                                                                                 [ 78%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_num_images_per_prompt PASSED                                                                                         [ 79%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_pipeline_call_signature PASSED                                                                                       [ 80%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_pipeline_interrupt PASSED                                                                                            [ 81%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_progress_bar PASSED                                                                                                  [ 81%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_pt_np_pil_inputs_equivalent PASSED                                                                                   [ 82%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_pt_np_pil_outputs_equivalent PASSED                                                                                  [ 83%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_two_xl_mixture_of_denoiser_fast PASSED                                                              [ 84%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_img2img_negative_conditions PASSED                                                               [ 85%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_2_images PASSED                                                                          [ 86%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_euler PASSED                                                                             [ 86%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_euler_lcm PASSED                                                                         [ 87%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_euler_lcm_custom_timesteps PASSED                                                        [ 88%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_mask_latents PASSED                                                                      [ 88%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_negative_prompt_embeds PASSED                                                            [ 89%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_inpaint_no_throughput_regression PASSED                                                          [ 90%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_multi_prompts PASSED                                                                             [ 90%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_stable_diffusion_xl_refiner PASSED                                                                                   [ 91%]
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineTests::test_to_dtype PASSED                                                                                                      [ 92%]
tests/test_diffusers.py::GaudiDDPMPipelineTester::test_ddpmpipline_batch_sizes PASSED                                                                                                     [ 94%]
tests/test_diffusers.py::GaudiDDPMPipelineTester::test_ddpmpipline_bf16 PASSED                                                                                                            [ 94%]
tests/test_diffusers.py::GaudiDDPMPipelineTester::test_ddpmpipline_default PASSED                                                                                                         [ 95%]
tests/test_diffusers.py::GaudiDDPMPipelineTester::test_ddpmpipline_hpu_graphs PASSED                                                                                                      [ 96%]
tests/test_diffusers.py::GaudiDDPMPipelineTester::test_no_throughput_regression_bf16 PASSED                                                                                               [ 96%]
tests/test_diffusers.py::GaudiFluxPipelineTester::test_flux_different_prompts PASSED                                                                                                      [ 97%]
tests/test_diffusers.py::GaudiFluxPipelineTester::test_flux_inference PASSED                                                                                                              [ 98%]
tests/test_diffusers.py::GaudiFluxPipelineTester::test_flux_prompt_embeds PASSED                                                                                                          [ 98%]
tests/test_diffusers.py::GaudiFluxImg2ImgPipelineTester::test_flux_different_prompts PASSED                                                                                               [ 99%]
tests/test_diffusers.py::GaudiFluxImg2ImgPipelineTester::test_flux_prompt_embeds PASSED                                                                                                   [100%]

Multi-card tests:

python -m pytest tests/test_diffusers.py -v -k "test_sd_textual_inversion"
Output:
tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_sd_textual_inversion PASSED                                                                                             [100%]

python -m pytest tests/test_diffusers.py -v -k "test_train_controlnet"
Output:
tests/test_diffusers.py::TrainControlNet::test_train_controlnet PASSED                                                                                                                    [100%]

yafshar · 2025-01-17T17:00:03Z

@dsocek Thank you for your excellent work! I'll complete my review as soon as possible.

yafshar · 2025-01-17T19:10:04Z

@dsocek I am still getting some random failures from the "textual_inversion" test, but I think that overall everything is great. We should merge this PR and fix anything else later.

[rank0]:   File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 75, in __torch_function__
[rank0]:     return super().__torch_function__(func, types, new_args, kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpu/__init__.py", line 81, in _lazy_init
[rank0]:     _hpu_C.init()
[rank0]: RuntimeError: synStatus=8 [Device not found] Device acquire failed.

yafshar

LGTM!

Hi @regisss, this PR is ready for your final review. Could you please take a look?

dsocek · 2025-01-17T19:24:08Z

@yafshar Actually, this is an issue I have noticed as well. The CI fails if you run the multi-card tests together with the other tests; if you run them alone, they pass. This should be investigated and fixed independently of this PR.

regisss · 2025-01-27T13:07:21Z

Yes, this is something I have noticed too, and that's actually why the Diffusers slow tests are not run with one command but with several specific ones:

optimum-habana/Makefile

Line 101 in 3c251d0

python -m pytest tests/test_diffusers.py -v -s -k "test_textual_inversion"
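A minimal sketch of that approach, running the multi-card tests in their own pytest process, could look like the following. The helper names `build_commands` and `run_all` are hypothetical, and the split using `-k "not ..."` is an assumption made for illustration; the actual Makefile targets may group the tests differently.

```python
import os
import subprocess
import sys

TEST_FILE = "tests/test_diffusers.py"


def build_commands(multi_card_filter="test_textual_inversion"):
    """Return one pytest command per group so multi-card tests get their own process."""
    base = [sys.executable, "-m", "pytest", TEST_FILE, "-v", "-s"]
    return [
        base + ["-k", f"not {multi_card_filter}"],  # everything else
        base + ["-k", multi_card_filter],  # multi-card tests alone
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in a fresh process and return the exit codes."""
    # Same environment variables as the commands posted in this thread.
    env = dict(os.environ, RUN_SLOW="true", GAUDI2_CI="1")
    return [runner(cmd, env=env).returncode for cmd in commands]
```

Calling `run_all(build_commands())` would launch the two groups one after the other, so the device is released by the first process before the multi-card run tries to acquire it.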

tests/test_diffusers.py

HuggingFaceDocBuilderDev · 2025-01-27T17:29:15Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

regisss

LGTM!

regisss · 2025-01-31T14:20:17Z

@dsocek It seems the loss at the end of the training of the textual inversion test (regular SD, not SDXL) is NaN. It is not caused by this PR and it happens with previous commits too. Any idea why this could happen? It doesn't happen with SDXL.

dsocek requested a review from regisss as a code owner January 6, 2025 20:00

yafshar reviewed Jan 8, 2025
