Kandinsky 2.2 fails to load #5044

vladmandic · 2023-09-14T15:40:28Z

Describe the bug

I've tried both DiffusionPipeline and AutoPipelineForText2Image, and Kandinsky 2.2 fails to load with either:

sd_model = diffusers.DiffusionPipeline.from_pretrained('models/Diffusers/models--kandinsky-community--kandinsky-2-2-decoder')
sd_model = diffusers.AutoPipelineForText2Image.from_pretrained('models/Diffusers/models--kandinsky-community--kandinsky-2-2-decoder')

error is:

pipeline_kandinsky2_2_combined.KandinskyV22CombinedPipeline'> 
expected {'prior_image_processor', 'prior_image_encoder', 'prior_tokenizer', 'prior_prior', 'prior_text_encoder', 'prior_scheduler', 'movq', 'unet', 'scheduler'}, 
but only {'unet', 'movq', 'scheduler'} were passed.

note that kandinsky 2.1 works fine (and so do most other models)
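
The error above amounts to a set difference between the components the combined pipeline expects and the components that were actually loaded. A minimal sketch, using the two sets copied from the message (the variable names are illustrative and are not diffusers internals):

```python
# Component sets copied verbatim from the error message above.
expected = {
    "prior_image_processor", "prior_image_encoder", "prior_tokenizer",
    "prior_prior", "prior_text_encoder", "prior_scheduler",
    "movq", "unet", "scheduler",
}
passed = {"unet", "movq", "scheduler"}

# Everything the combined pipeline wanted but did not receive.
missing = sorted(expected - passed)
print(missing)

# Check whether every missing component belongs to the prior.
all_prior = all(name.startswith("prior_") for name in missing)
print(all_prior)
```

Every missing name carries the `prior_` prefix, so only the decoder's own components were found. That is consistent with the connected prior pipeline not being picked up, although the set difference alone does not say why.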

Reproduction

import os
import torch
import diffusers

cache_dir = '/home/vlado/dev/sdnext/models/Diffusers'
model_path = 'models--kandinsky-community--kandinsky-2-2-decoder/snapshots/824f8d584960715056dc509c119d6a33cde34889'

pipe = diffusers.AutoPipelineForText2Image.from_pretrained(os.path.join(cache_dir, model_path), torch_dtype=torch.float16, cache_dir=cache_dir)

Logs

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/vlado/dev/sdnext/cli/test-diffuser.py:8 in <module>                                        │
│                                                                                                  │
│    5 cache_dir = '/home/vlado/dev/sdnext/models/Diffusers'                                       │
│    6 model_path = 'models--kandinsky-community--kandinsky-2-2-decoder/snapshots/824f8d5849607    │
│    7                                                                                             │
│ ❱  8 pipe = diffusers.AutoPipelineForText2Image.from_pretrained(os.path.join(cache_dir, model    │
│    9 pipe = pipe.to("cuda")                                                                      │
│   10                                                                                             │
│   11 prompt = "portrait of a young women, blue eyes, cinematic"                                  │
│                                                                                                  │
│ /home/vlado/.local/lib/python3.11/site-packages/diffusers/pipelines/auto_pipeline.py:331 in      │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   328 │   │   text_2_image_cls = _get_task_class(AUTO_TEXT2IMAGE_PIPELINES_MAPPING, orig_class   │
│   329 │   │                                                                                      │
│   330 │   │   kwargs = {**load_config_kwargs, **kwargs}                                          │
│ ❱ 331 │   │   return text_2_image_cls.from_pretrained(pretrained_model_or_path, **kwargs)        │
│   332 │                                                                                          │
│   333 │   @classmethod                                                                           │
│   334 │   def from_pipe(cls, pipeline, **kwargs):                                                │
│                                                                                                  │
│ /home/vlado/.local/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py:1197 in    │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   1194 │   │   │   │   init_kwargs[module] = passed_class_obj.get(module, None)                  │
│   1195 │   │   elif len(missing_modules) > 0:                                                    │
│   1196 │   │   │   passed_modules = set(list(init_kwargs.keys()) + list(passed_class_obj.keys()  │
│ ❱ 1197 │   │   │   raise ValueError(                                                             │
│   1198 │   │   │   │   f"Pipeline {pipeline_class} expected {expected_modules}, but only {passe  │
│   1199 │   │   │   )                                                                             │
│   1200                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯

System Info

diffusers version: 0.21.1
Platform: Linux-6.1.21.2-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python version: 3.11.1
PyTorch version (GPU?): 2.1.0.dev20230903+cu121 (True)
Huggingface_hub version: 0.16.4
Transformers version: 4.31.0
Accelerate version: 0.20.3
xFormers version: not installed
Using GPU in script?:
Using distributed or parallel set-up in script?:

Who can help?

@patrickvonplaten @sayakpaul @williamberman

patrickvonplaten · 2023-09-14T16:53:58Z

Thanks for the nice issue!

The following works for me:

from diffusers import DiffusionPipeline
import torch

cache_dir = DiffusionPipeline.download("kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16, variant="fp16", load_connected_pipeline=True)
pipe = DiffusionPipeline.from_pretrained(cache_dir, load_connected_pipeline=True)

where cache_dir is a local path to "/home/patrick/.cache/huggingface/hub/models--kandinsky-community--kandinsky-2-2-decoder/snapshots/44eb212bdc717528a43a52b61beeb2fd98766fe4"

Could you try passing load_connected_pipeline=True? (We should always pass this for UIs I think)

It is also possible that you have an older version of Kandinsky on your local disk. Could you try re-downloading it?

Please let me know if this works - if yes, I'll make sure we document load_connected_pipeline better

vladmandic · 2023-09-14T20:06:56Z

load_connected_pipeline was already set, I just forgot to include it in the example above.
I wasn't aware that there was an update to the model itself. I just deleted my copy and re-downloaded it, and it works fine now.
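
For anyone hitting the same error with a stale local copy, the "delete and re-download" step described above could be sketched roughly as follows. This is only a sketch: `remove_cached_repo` is a hypothetical helper (not part of diffusers or huggingface_hub), and it assumes the `models--<org>--<name>` folder naming visible in the paths quoted earlier in this thread.

```python
import os
import shutil


def remove_cached_repo(cache_dir: str, repo_id: str) -> bool:
    """Delete the cached copy of `repo_id` under `cache_dir`, if present.

    Hypothetical helper; assumes the `models--<org>--<name>` layout
    seen in the paths quoted in this thread.
    Returns True if a folder was removed, False if nothing was there.
    """
    folder = "models--" + repo_id.replace("/", "--")
    path = os.path.join(cache_dir, folder)
    if os.path.isdir(path):
        shutil.rmtree(path)
        return True
    return False
```

After calling `remove_cached_repo(cache_dir, "kandinsky-community/kandinsky-2-2-decoder")`, loading the repo id again with `load_connected_pipeline=True` should fetch a fresh copy. Passing `force_download=True` to `from_pretrained` may be an alternative to deleting the folder by hand, but that was not tried in this thread.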

## [Kandinsky2.2 training support · Issue #268 · PaddlePaddle/PaddleMIX](https://github.com/PaddlePaddle/PaddleMIX/issues/268)

### 1 Loss alignment results for the first 200 steps:

- decoder w/o LoRA: ![decoder](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/ac52377b-5522-4ffb-8ea8-3ad73668cbc5)
- prior w/o LoRA: ![prior](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/af24f7c2-2618-4db0-bdaf-764f72f47c9a)
- decoder with LoRA: ![decoder_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/231573c1-9d7c-46da-8b16-592a22d248af)
- prior with LoRA: ![prior_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/79c166d9-0a08-48b6-84e0-3c802e857ff9)
- decoder fine-tune results after 3k steps (prompt: A robot pokemon, 4k photo): ![robot-pokemon](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/a7e8ef2d-08b1-4ef2-80d8-826704340de2)

### 2 Other changes

[ppdiffusers/models/attention_processor.py/LoRAAttnAddedKVProcessor.call](https://github.com/PaddlePaddle/PaddleMIX/blob/ff0d2f25c79cc6e34e7d9c071328a7ed8bea4bc3/ppdiffusers/ppdiffusers/models/attention_processor.py#L789C57-L789C79): axis = 1 -> axis = 2

Reason for the change: running python train_text_to_image_decoder_lora.py with LoRAAttnAddedKVProcessor raised a concat dimension error.

### 3 Alignment notes

- Shuffle was disabled in the dataloaders of both diffusers and ppdiffusers so that the data order is identical;
- The same random seed was set, and the noise and timesteps that introduce randomness in the training loop were generated with numpy so that both sides see the same random values (this logic has been removed from the submitted code).

### 4 Known issues

- Using AutoPipelineForText2Image(args.pretrained_decoder_model_name_or_path) in ppdiffusers reports missing components:

```bash
ValueError: Pipeline <class 'ppdiffusers.pipelines.kandinsky2_2.pipeline_kandinsky2_2_combined.KandinskyV22CombinedPipeline'> expected {'unet', 'prior_image_processor', 'prior_text_encoder', 'prior_image_encoder', 'movq', 'prior_prior', 'prior_scheduler', 'prior_tokenizer', 'scheduler'}, but only {'unet', 'movq', 'scheduler'} were passed.
```

  Only some of the components are recognized; unlike diffusers, ppdiffusers does not detect all components automatically. As a fallback, the submitted code defines each component individually and passes them in to AutoPipelineForText2Image, which is not very concise. The cause has not been determined yet; a [diffusers issue](https://github.com/huggingface/diffusers/issues/5044) looks similar to this problem.
- With pip install ppdiffusers==0.19.4, loading the prior's LoRA weights fails because PriorTransformer has no load_attn_procs, so pipeline.prior_prior.load_attn_procs(args.output_dir) cannot be used. Building the ppdiffusers package from the latest develop branch does not have this problem.

Looking forward to replies and suggestions about merging, Thx :)

Co-authored-by: Tsaiyue <tsaiyue01@gamil.com>

vladmandic added the bug Something isn't working label Sep 14, 2023

vladmandic closed this as completed Sep 14, 2023

Tsaiyue mentioned this issue Jan 10, 2024

[ppdiffusers] Kandinsky2_2 trainning support PaddlePaddle/PaddleMIX#378

Merged
