Kandinsky 2.2 fails to load #5044

Closed
vladmandic opened this issue Sep 14, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@vladmandic
Contributor

Describe the bug

i've tried both DiffusionPipeline and AutoPipelineForText2Image and Kandinsky 2.2 fails to load:

sd_model = diffusers.DiffusionPipeline.from_pretrained('models/Diffusers/models--kandinsky-community--kandinsky-2-2-decoder')
sd_model = diffusers.AutoPipelineForText2Image.from_pretrained('models/Diffusers/models--kandinsky-community--kandinsky-2-2-decoder')

error is:

pipeline_kandinsky2_2_combined.KandinskyV22CombinedPipeline'> 
expected {'prior_image_processor', 'prior_image_encoder', 'prior_tokenizer', 'prior_prior', 'prior_text_encoder', 'prior_scheduler', 'movq', 'unet', 'scheduler'}, 
but only {'unet', 'movq', 'scheduler'} were passed.

note that kandinsky 2.1 works fine (and so do most other models)
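
For readers skimming the error, a minimal sketch of the same comparison as a plain set difference may help. The module names are copied from the message above; the helper name `missing_modules` is hypothetical and is not part of diffusers.

```python
# Module names copied verbatim from the error message in this issue.
expected = {
    "prior_image_processor", "prior_image_encoder", "prior_tokenizer",
    "prior_prior", "prior_text_encoder", "prior_scheduler",
    "movq", "unet", "scheduler",
}
passed = {"unet", "movq", "scheduler"}


def missing_modules(expected_modules, passed_modules):
    """Return the expected module names that were not passed, sorted by name."""
    return sorted(set(expected_modules) - set(passed_modules))


missing = missing_modules(expected, passed)
print(missing)
```

Every missing name carries the `prior_` prefix, so the decoder components loaded and only the prior half of the combined pipeline was not found.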

Reproduction

import os
import torch
import diffusers

cache_dir = '/home/vlado/dev/sdnext/models/Diffusers'
model_path = 'models--kandinsky-community--kandinsky-2-2-decoder/snapshots/824f8d584960715056dc509c119d6a33cde34889'

pipe = diffusers.AutoPipelineForText2Image.from_pretrained(os.path.join(cache_dir, model_path), torch_dtype=torch.float16, cache_dir=cache_dir)

Logs

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/vlado/dev/sdnext/cli/test-diffuser.py:8 in <module>                                        │
│                                                                                                  │
│    5 cache_dir = '/home/vlado/dev/sdnext/models/Diffusers'                                       │
│    6 model_path = 'models--kandinsky-community--kandinsky-2-2-decoder/snapshots/824f8d5849607    │
│    7                                                                                             │
│ ❱  8 pipe = diffusers.AutoPipelineForText2Image.from_pretrained(os.path.join(cache_dir, model    │
│    9 pipe = pipe.to("cuda")                                                                      │
│   10                                                                                             │
│   11 prompt = "portrait of a young women, blue eyes, cinematic"                                  │
│                                                                                                  │
│ /home/vlado/.local/lib/python3.11/site-packages/diffusers/pipelines/auto_pipeline.py:331 in      │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   328 │   │   text_2_image_cls = _get_task_class(AUTO_TEXT2IMAGE_PIPELINES_MAPPING, orig_class   │
│   329 │   │                                                                                      │
│   330 │   │   kwargs = {**load_config_kwargs, **kwargs}                                          │
│ ❱ 331 │   │   return text_2_image_cls.from_pretrained(pretrained_model_or_path, **kwargs)        │
│   332 │                                                                                          │
│   333 │   @classmethod                                                                           │
│   334 │   def from_pipe(cls, pipeline, **kwargs):                                                │
│                                                                                                  │
│ /home/vlado/.local/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py:1197 in    │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   1194 │   │   │   │   init_kwargs[module] = passed_class_obj.get(module, None)                  │
│   1195 │   │   elif len(missing_modules) > 0:                                                    │
│   1196 │   │   │   passed_modules = set(list(init_kwargs.keys()) + list(passed_class_obj.keys()  │
│ ❱ 1197 │   │   │   raise ValueError(                                                             │
│   1198 │   │   │   │   f"Pipeline {pipeline_class} expected {expected_modules}, but only {passe  │
│   1199 │   │   │   )                                                                             │
│   1200                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯

System Info

  • diffusers version: 0.21.1
  • Platform: Linux-6.1.21.2-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.11.1
  • PyTorch version (GPU?): 2.1.0.dev20230903+cu121 (True)
  • Huggingface_hub version: 0.16.4
  • Transformers version: 4.31.0
  • Accelerate version: 0.20.3
  • xFormers version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@patrickvonplaten @sayakpaul @williamberman

@vladmandic vladmandic added the bug Something isn't working label Sep 14, 2023
@patrickvonplaten
Contributor

Thanks for the nice issue!

The following works for me:

from diffusers import DiffusionPipeline
import torch

cache_dir = DiffusionPipeline.download("kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16, variant="fp16", load_connected_pipeline=True)
pipe = DiffusionPipeline.from_pretrained(cache_dir, load_connected_pipeline=True)

where cache_dir is a local path to "/home/patrick/.cache/huggingface/hub/models--kandinsky-community--kandinsky-2-2-decoder/snapshots/44eb212bdc717528a43a52b61beeb2fd98766fe4"

Could you try passing load_connected_pipeline=True? (We should always pass this for UIs I think)

Also what could happen here is that you have an older version of Kandinsky on your local disk, could you maybe try re-downloading it?

Please let me know if this works - if yes, I'll make sure we document load_connected_pipeline better

@vladmandic
Contributor Author

load_connected_pipeline was already set, i just forgot to include it in the example above.
i wasn't aware that there was an update to the model itself. i just deleted my copy and re-downloaded it, and it works fine.
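
The fix that worked here (delete the stale local copy, then download again with `load_connected_pipeline=True`) could be sketched roughly as below. This is only a sketch under assumptions: the helper names `remove_cached_model` and `reload_pipeline` are hypothetical, and the `models--<org>--<name>` folder layout is inferred from the paths quoted in this thread.

```python
import shutil
from pathlib import Path


def remove_cached_model(cache_dir, repo_id):
    """Delete the hub-style cache folder for repo_id.

    Returns True if a folder was removed, False if nothing was there.
    """
    folder = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    if folder.is_dir():
        shutil.rmtree(folder)
        return True
    return False


def reload_pipeline(cache_dir, repo_id="kandinsky-community/kandinsky-2-2-decoder"):
    """Drop any stale copy, then download and load the model again."""
    # Imported lazily so the cache helper above works without torch/diffusers.
    import torch
    from diffusers import DiffusionPipeline

    remove_cached_model(cache_dir, repo_id)
    return DiffusionPipeline.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,
        cache_dir=cache_dir,
        load_connected_pipeline=True,  # flag suggested in the comment above
    )
```

Calling `reload_pipeline('/home/vlado/dev/sdnext/models/Diffusers')` would re-fetch the decoder into the same cache directory used in the reproduction script.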

JunnYu referenced this issue in PaddlePaddle/PaddleMIX Jan 15, 2024
## [Kandinsky2.2 training support · Issue #268 · PaddlePaddle/PaddleMIX](https://github.com/PaddlePaddle/PaddleMIX/issues/268)

### 1 Loss alignment results for the first 200 steps:

- decoder w/o LoRA:
  

![decoder](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/ac52377b-5522-4ffb-8ea8-3ad73668cbc5)

- prior w/o LoRA:
  

![prior](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/af24f7c2-2618-4db0-bdaf-764f72f47c9a)
  
- decoder with LoRA:
  

![decoder_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/231573c1-9d7c-46da-8b16-592a22d248af)

- prior with LoRA:
  

![prior_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/79c166d9-0a08-48b6-84e0-3c802e857ff9)
  
- decoder fine-tuning results after 3k steps (prompt: A robot pokemon, 4k photo):
  

![robot-pokemon](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/a7e8ef2d-08b1-4ef2-80d8-826704340de2)

### 2 Other changes

[ppdiffusers/models/attention_processor.py/LoRAAttnAddedKVProcessor.call](https://github.com/PaddlePaddle/PaddleMIX/blob/ff0d2f25c79cc6e34e7d9c071328a7ed8bea4bc3/ppdiffusers/ppdiffusers/models/attention_processor.py#L789C57-L789C79): axis = 1 -> axis = 2

Reason for the change: running `python train_text_to_image_decoder_lora.py` with LoRAAttnAddedKVProcessor raised a dimension error in the concat operation.
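
To illustrate why the concat axis matters, here is a small NumPy sketch. It assumes a 4-D `[batch, heads, sequence, head_dim]` layout for the key/value projections, which the commit message does not state; the shapes and the helper name `concat_added_kv` are hypothetical.

```python
import numpy as np


def concat_added_kv(encoder_proj, hidden_proj, axis):
    """Join the added (encoder) projection with the hidden-state projection."""
    return np.concatenate([encoder_proj, hidden_proj], axis=axis)


# Hypothetical shapes: the two inputs differ only in sequence length.
encoder_key = np.zeros((2, 4, 77, 8), dtype="float32")
hidden_key = np.zeros((2, 4, 64, 8), dtype="float32")

# Axis 2 is the sequence axis in this layout, so the lengths simply add up.
merged = concat_added_kv(encoder_key, hidden_key, axis=2)

# Axis 1 is the head axis; the sequence lengths must then match, so it fails.
try:
    concat_added_kv(encoder_key, hidden_key, axis=1)
    axis1_failed = False
except ValueError:
    axis1_failed = True
```

Under that layout assumption, `axis = 2` joins along the sequence dimension, while `axis = 1` hits the kind of dimension mismatch described above.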

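### 3 Alignment notes

- Disabled shuffle in the dataloaders of both diffusers and ppdiffusers so that the data order is identical;
  
- Set the same random seed, and changed the noise and timesteps, which introduce randomness in the training loop, to be generated by numpy so that both frameworks get identical random values (this logic has been removed from the submitted code).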
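
A rough sketch of how the noise and timesteps could be drawn from numpy so that two frameworks see identical values. The actual alignment code was removed from the submission, so the function name, the seed handling, and the default of 1000 training timesteps are all assumptions made for illustration.

```python
import numpy as np


def make_noise_and_timesteps(seed, latent_shape, num_train_timesteps=1000):
    """Draw noise and per-sample timesteps from a seeded NumPy generator.

    The returned arrays can be converted to tensors in either framework,
    so both training loops consume exactly the same random values.
    """
    rng = np.random.RandomState(seed)
    noise = rng.randn(*latent_shape).astype("float32")
    timesteps = rng.randint(
        0, num_train_timesteps, size=(latent_shape[0],)
    ).astype("int64")
    return noise, timesteps


noise_a, steps_a = make_noise_and_timesteps(0, (2, 4, 8, 8))
noise_b, steps_b = make_noise_and_timesteps(0, (2, 4, 8, 8))
```

Each framework would then wrap the arrays with its own tensor constructor before adding the noise to the latents.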
  

### 4 Known issues

- Using AutoPipelineForText2Image(args.pretrained_decoder_model_name_or_path) in ppdiffusers fails with missing components:

```bash
     ValueError: Pipeline <class 'ppdiffusers.pipelines.kandinsky2_2.pipeline_kandinsky2_2_combined.KandinskyV22CombinedPipeline'> expected {'unet', 'prior_image_processor', 'prior_text_encoder', 'prior_image_encoder', 'movq', 'prior_prior', 'prior_scheduler', 'prior_tokenizer', 'scheduler'}, but only {'unet', 'movq', 'scheduler'} were passed.
```

   
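Only some of the components are recognized; unlike diffusers, ppdiffusers cannot detect all of them automatically. The submitted code therefore falls back to a workaround: each component is defined individually and passed in before AutoPipelineForText2Image is called, which is not very concise. The cause has not been determined yet; a [diffusers issue](https://github.com/huggingface/diffusers/issues/5044) looks similar to this problem.

- With `pip install ppdiffusers==0.19.4`, loading the prior's LoRA weights fails because PriorTransformer has no load_attn_procs, so pipeline.prior_prior.load_attn_procs(args.output_dir) cannot be used. Building the ppdiffusers package from the latest develop branch does not show this problem.

----------Looking forward to your replies and suggestions about merging, Thx :)------------------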

---------

Co-authored-by: Tsaiyue <tsaiyue01@gamil.com>
westfish referenced this issue in westfish/PaddleMIX Sep 25, 2024