
TypeError: Linear.forward() got an unexpected keyword argument 'lora_scale' with this command #879

Closed

ghost opened this issue Aug 30, 2023 · 4 comments

@ghost
ghost commented Aug 30, 2023

System Info

  • diffusers version: 0.21.0.dev0
  • Platform: Linux-5.15.0-1041-aws-x86_64-with-glibc2.31
  • Python version: 3.10.9
  • PyTorch version (GPU?): 2.0.1+cu117 (True)
  • Huggingface_hub version: 0.16.4
  • Transformers version: 4.32.1
  • Accelerate version: 0.22.0
  • xFormers version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@pacman100 @younesbelkada @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

git clone https://github.com/huggingface/peft

cd peft/examples/lora_dreambooth

pip install -r requirements.txt
pip install git+https://github.com/huggingface/peft

export MODEL_NAME="CompVis/stable-diffusion-v1-4" 
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --train_text_encoder \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks man" \
  --class_prompt="a photo of man" \
  --resolution=512 \
  --train_batch_size=1 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --use_lora \
  --lora_r 16 \
  --lora_alpha 27 \
  --lora_text_encoder_r 16 \
  --lora_text_encoder_alpha 17 \
  --learning_rate=1e-4 \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --max_train_steps=500

Expected behavior

accelerate launch works as expected

@BenjaminBossan
Member

Could you please paste the full stacktrace?

@ghost
Author

ghost commented Aug 30, 2023

Yeah

accelerate launch train_dreambooth.py   --pretrained_model_name_or_path=$MODEL_NAME  --instance_data_dir=$INSTANCE_DIR  --class_data_dir=$CLASS_DIR  --output_dir=$OUTPUT_DIR  --train_text_encoder   --with_prior_preservation --prior_loss_weight=1.0   --instance_prompt="a photo of sks man"  --class_prompt="a photo of man"  --resolution=512   --train_batch_size=1   --lr_scheduler="constant"  --lr_warmup_steps=0   --num_class_images=200   --use_lora   --lora_r 16   --lora_alpha 27   --lora_text_encoder_r 16   --lora_text_encoder_alpha 17   --learning_rate=1e-4   --gradient_accumulation_steps=1   --gradient_checkpointing   --max_train_steps=500
08/30/2023 04:10:30 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'force_upcast', 'norm_num_groups'} was not found in config. Values will be initialized to default values.
{'encoder_hid_dim', 'dual_cross_attention', 'time_embedding_act_fn', 'addition_embed_type_num_heads', 'only_cross_attention', 'projection_class_embeddings_input_dim', 'time_embedding_dim', 'cross_attention_norm', 'num_class_embeds', 'upcast_attention', 'mid_block_only_cross_attention', 'transformer_layers_per_block', 'resnet_skip_time_act', 'class_embeddings_concat', 'class_embed_type', 'resnet_out_scale_factor', 'conv_out_kernel', 'num_attention_heads', 'encoder_hid_dim_type', 'mid_block_type', 'conv_in_kernel', 'attention_type', 'resnet_time_scale_shift', 'use_linear_projection', 'addition_embed_type', 'time_embedding_type', 'time_cond_proj_dim', 'addition_time_embed_dim', 'timestep_post_act'} was not found in config. Values will be initialized to default values.
trainable params: 1,594,368 || all params: 861,115,332 || trainable%: 0.18515150535027286
PeftModel(
  (base_model): LoraModel(
    (model): UNet2DConditionModel(
      (conv_in): Conv2d(4, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (time_proj): Timesteps()
      (time_embedding): TimestepEmbedding(
        (linear_1): Linear(in_features=320, out_features=1280, bias=True)
        (act): SiLU()
        (linear_2): Linear(in_features=1280, out_features=1280, bias=True)
      )
      (down_blocks): ModuleList(
        (0): CrossAttnDownBlock2D(
          (attentions): ModuleList(
            (0-1): 2 x Transformer2DModel(
              (norm): GroupNorm(32, 320, eps=1e-06, affine=True)
              (proj_in): LoRACompatibleConv(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (transformer_blocks): ModuleList(
                (0): BasicTransformerBlock(
                  (norm1): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
                  (attn1): Attention(
                    (to_q): Linear(
                      in_features=320, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=320, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=320, out_features=320, bias=False)
                    (to_v): Linear(
                      in_features=320, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=320, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=320, out_features=320, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm2): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
                  (attn2): Attention(
                    (to_q): Linear(
                      in_features=320, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=320, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=768, out_features=320, bias=False)
                    (to_v): Linear(
                      in_features=768, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=768, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=320, out_features=320, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm3): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
                  (ff): FeedForward(
                    (net): ModuleList(
                      (0): GEGLU(
                        (proj): LoRACompatibleLinear(in_features=320, out_features=2560, bias=True)
                      )
                      (1): Dropout(p=0.0, inplace=False)
                      (2): LoRACompatibleLinear(in_features=1280, out_features=320, bias=True)
                    )
                  )
                )
              )
              (proj_out): LoRACompatibleConv(320, 320, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (resnets): ModuleList(
            (0-1): 2 x ResnetBlock2D(
              (norm1): GroupNorm(32, 320, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=320, bias=True)
              (norm2): GroupNorm(32, 320, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
            )
          )
          (downsamplers): ModuleList(
            (0): Downsample2D(
              (conv): LoRACompatibleConv(320, 320, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
            )
          )
        )
        (1): CrossAttnDownBlock2D(
          (attentions): ModuleList(
            (0-1): 2 x Transformer2DModel(
              (norm): GroupNorm(32, 640, eps=1e-06, affine=True)
              (proj_in): LoRACompatibleConv(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (transformer_blocks): ModuleList(
                (0): BasicTransformerBlock(
                  (norm1): LayerNorm((640,), eps=1e-05, elementwise_affine=True)
                  (attn1): Attention(
                    (to_q): Linear(
                      in_features=640, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=640, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=640, out_features=640, bias=False)
                    (to_v): Linear(
                      in_features=640, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=640, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=640, out_features=640, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm2): LayerNorm((640,), eps=1e-05, elementwise_affine=True)
                  (attn2): Attention(
                    (to_q): Linear(
                      in_features=640, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=640, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=768, out_features=640, bias=False)
                    (to_v): Linear(
                      in_features=768, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=768, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=640, out_features=640, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm3): LayerNorm((640,), eps=1e-05, elementwise_affine=True)
                  (ff): FeedForward(
                    (net): ModuleList(
                      (0): GEGLU(
                        (proj): LoRACompatibleLinear(in_features=640, out_features=5120, bias=True)
                      )
                      (1): Dropout(p=0.0, inplace=False)
                      (2): LoRACompatibleLinear(in_features=2560, out_features=640, bias=True)
                    )
                  )
                )
              )
              (proj_out): LoRACompatibleConv(640, 640, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (resnets): ModuleList(
            (0): ResnetBlock2D(
              (norm1): GroupNorm(32, 320, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(320, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=640, bias=True)
              (norm2): GroupNorm(32, 640, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(320, 640, kernel_size=(1, 1), stride=(1, 1))
            )
            (1): ResnetBlock2D(
              (norm1): GroupNorm(32, 640, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=640, bias=True)
              (norm2): GroupNorm(32, 640, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
            )
          )
          (downsamplers): ModuleList(
            (0): Downsample2D(
              (conv): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
            )
          )
        )
        (2): CrossAttnDownBlock2D(
          (attentions): ModuleList(
            (0-1): 2 x Transformer2DModel(
              (norm): GroupNorm(32, 1280, eps=1e-06, affine=True)
              (proj_in): LoRACompatibleConv(1280, 1280, kernel_size=(1, 1), stride=(1, 1))
              (transformer_blocks): ModuleList(
                (0): BasicTransformerBlock(
                  (norm1): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                  (attn1): Attention(
                    (to_q): Linear(
                      in_features=1280, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=1280, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=False)
                    (to_v): Linear(
                      in_features=1280, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=1280, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm2): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                  (attn2): Attention(
                    (to_q): Linear(
                      in_features=1280, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=1280, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=768, out_features=1280, bias=False)
                    (to_v): Linear(
                      in_features=768, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=768, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm3): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                  (ff): FeedForward(
                    (net): ModuleList(
                      (0): GEGLU(
                        (proj): LoRACompatibleLinear(in_features=1280, out_features=10240, bias=True)
                      )
                      (1): Dropout(p=0.0, inplace=False)
                      (2): LoRACompatibleLinear(in_features=5120, out_features=1280, bias=True)
                    )
                  )
                )
              )
              (proj_out): LoRACompatibleConv(1280, 1280, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (resnets): ModuleList(
            (0): ResnetBlock2D(
              (norm1): GroupNorm(32, 640, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(640, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
              (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(640, 1280, kernel_size=(1, 1), stride=(1, 1))
            )
            (1): ResnetBlock2D(
              (norm1): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
              (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
            )
          )
          (downsamplers): ModuleList(
            (0): Downsample2D(
              (conv): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
            )
          )
        )
        (3): DownBlock2D(
          (resnets): ModuleList(
            (0-1): 2 x ResnetBlock2D(
              (norm1): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
              (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
            )
          )
        )
      )
      (up_blocks): ModuleList(
        (0): UpBlock2D(
          (resnets): ModuleList(
            (0-2): 3 x ResnetBlock2D(
              (norm1): GroupNorm(32, 2560, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(2560, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
              (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(2560, 1280, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (upsamplers): ModuleList(
            (0): Upsample2D(
              (conv): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
            )
          )
        )
        (1): CrossAttnUpBlock2D(
          (attentions): ModuleList(
            (0-2): 3 x Transformer2DModel(
              (norm): GroupNorm(32, 1280, eps=1e-06, affine=True)
              (proj_in): LoRACompatibleConv(1280, 1280, kernel_size=(1, 1), stride=(1, 1))
              (transformer_blocks): ModuleList(
                (0): BasicTransformerBlock(
                  (norm1): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                  (attn1): Attention(
                    (to_q): Linear(
                      in_features=1280, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=1280, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=False)
                    (to_v): Linear(
                      in_features=1280, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=1280, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm2): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                  (attn2): Attention(
                    (to_q): Linear(
                      in_features=1280, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=1280, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=768, out_features=1280, bias=False)
                    (to_v): Linear(
                      in_features=768, out_features=1280, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=768, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=1280, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm3): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                  (ff): FeedForward(
                    (net): ModuleList(
                      (0): GEGLU(
                        (proj): LoRACompatibleLinear(in_features=1280, out_features=10240, bias=True)
                      )
                      (1): Dropout(p=0.0, inplace=False)
                      (2): LoRACompatibleLinear(in_features=5120, out_features=1280, bias=True)
                    )
                  )
                )
              )
              (proj_out): LoRACompatibleConv(1280, 1280, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (resnets): ModuleList(
            (0-1): 2 x ResnetBlock2D(
              (norm1): GroupNorm(32, 2560, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(2560, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
              (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(2560, 1280, kernel_size=(1, 1), stride=(1, 1))
            )
            (2): ResnetBlock2D(
              (norm1): GroupNorm(32, 1920, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(1920, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
              (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(1920, 1280, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (upsamplers): ModuleList(
            (0): Upsample2D(
              (conv): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
            )
          )
        )
        (2): CrossAttnUpBlock2D(
          (attentions): ModuleList(
            (0-2): 3 x Transformer2DModel(
              (norm): GroupNorm(32, 640, eps=1e-06, affine=True)
              (proj_in): LoRACompatibleConv(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (transformer_blocks): ModuleList(
                (0): BasicTransformerBlock(
                  (norm1): LayerNorm((640,), eps=1e-05, elementwise_affine=True)
                  (attn1): Attention(
                    (to_q): Linear(
                      in_features=640, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=640, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=640, out_features=640, bias=False)
                    (to_v): Linear(
                      in_features=640, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=640, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=640, out_features=640, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm2): LayerNorm((640,), eps=1e-05, elementwise_affine=True)
                  (attn2): Attention(
                    (to_q): Linear(
                      in_features=640, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=640, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=768, out_features=640, bias=False)
                    (to_v): Linear(
                      in_features=768, out_features=640, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=768, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=640, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=640, out_features=640, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm3): LayerNorm((640,), eps=1e-05, elementwise_affine=True)
                  (ff): FeedForward(
                    (net): ModuleList(
                      (0): GEGLU(
                        (proj): LoRACompatibleLinear(in_features=640, out_features=5120, bias=True)
                      )
                      (1): Dropout(p=0.0, inplace=False)
                      (2): LoRACompatibleLinear(in_features=2560, out_features=640, bias=True)
                    )
                  )
                )
              )
              (proj_out): LoRACompatibleConv(640, 640, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (resnets): ModuleList(
            (0): ResnetBlock2D(
              (norm1): GroupNorm(32, 1920, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(1920, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=640, bias=True)
              (norm2): GroupNorm(32, 640, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(1920, 640, kernel_size=(1, 1), stride=(1, 1))
            )
            (1): ResnetBlock2D(
              (norm1): GroupNorm(32, 1280, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(1280, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=640, bias=True)
              (norm2): GroupNorm(32, 640, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(1280, 640, kernel_size=(1, 1), stride=(1, 1))
            )
            (2): ResnetBlock2D(
              (norm1): GroupNorm(32, 960, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(960, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=640, bias=True)
              (norm2): GroupNorm(32, 640, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(960, 640, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (upsamplers): ModuleList(
            (0): Upsample2D(
              (conv): LoRACompatibleConv(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
            )
          )
        )
        (3): CrossAttnUpBlock2D(
          (attentions): ModuleList(
            (0-2): 3 x Transformer2DModel(
              (norm): GroupNorm(32, 320, eps=1e-06, affine=True)
              (proj_in): LoRACompatibleConv(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (transformer_blocks): ModuleList(
                (0): BasicTransformerBlock(
                  (norm1): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
                  (attn1): Attention(
                    (to_q): Linear(
                      in_features=320, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=320, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=320, out_features=320, bias=False)
                    (to_v): Linear(
                      in_features=320, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=320, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=320, out_features=320, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm2): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
                  (attn2): Attention(
                    (to_q): Linear(
                      in_features=320, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=320, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_k): LoRACompatibleLinear(in_features=768, out_features=320, bias=False)
                    (to_v): Linear(
                      in_features=768, out_features=320, bias=False
                      (lora_dropout): ModuleDict(
                        (default): Identity()
                      )
                      (lora_A): ModuleDict(
                        (default): Linear(in_features=768, out_features=16, bias=False)
                      )
                      (lora_B): ModuleDict(
                        (default): Linear(in_features=16, out_features=320, bias=False)
                      )
                      (lora_embedding_A): ParameterDict()
                      (lora_embedding_B): ParameterDict()
                    )
                    (to_out): ModuleList(
                      (0): LoRACompatibleLinear(in_features=320, out_features=320, bias=True)
                      (1): Dropout(p=0.0, inplace=False)
                    )
                  )
                  (norm3): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
                  (ff): FeedForward(
                    (net): ModuleList(
                      (0): GEGLU(
                        (proj): LoRACompatibleLinear(in_features=320, out_features=2560, bias=True)
                      )
                      (1): Dropout(p=0.0, inplace=False)
                      (2): LoRACompatibleLinear(in_features=1280, out_features=320, bias=True)
                    )
                  )
                )
              )
              (proj_out): LoRACompatibleConv(320, 320, kernel_size=(1, 1), stride=(1, 1))
            )
          )
          (resnets): ModuleList(
            (0): ResnetBlock2D(
              (norm1): GroupNorm(32, 960, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(960, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=320, bias=True)
              (norm2): GroupNorm(32, 320, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(960, 320, kernel_size=(1, 1), stride=(1, 1))
            )
            (1-2): 2 x ResnetBlock2D(
              (norm1): GroupNorm(32, 640, eps=1e-05, affine=True)
              (conv1): LoRACompatibleConv(640, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=320, bias=True)
              (norm2): GroupNorm(32, 320, eps=1e-05, affine=True)
              (dropout): Dropout(p=0.0, inplace=False)
              (conv2): LoRACompatibleConv(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
              (nonlinearity): SiLU()
              (conv_shortcut): LoRACompatibleConv(640, 320, kernel_size=(1, 1), stride=(1, 1))
            )
          )
        )
      )
      (mid_block): UNetMidBlock2DCrossAttn(
        (attentions): ModuleList(
          (0): Transformer2DModel(
            (norm): GroupNorm(32, 1280, eps=1e-06, affine=True)
            (proj_in): LoRACompatibleConv(1280, 1280, kernel_size=(1, 1), stride=(1, 1))
            (transformer_blocks): ModuleList(
              (0): BasicTransformerBlock(
                (norm1): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                (attn1): Attention(
                  (to_q): Linear(
                    in_features=1280, out_features=1280, bias=False
                    (lora_dropout): ModuleDict(
                      (default): Identity()
                    )
                    (lora_A): ModuleDict(
                      (default): Linear(in_features=1280, out_features=16, bias=False)
                    )
                    (lora_B): ModuleDict(
                      (default): Linear(in_features=16, out_features=1280, bias=False)
                    )
                    (lora_embedding_A): ParameterDict()
                    (lora_embedding_B): ParameterDict()
                  )
                  (to_k): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=False)
                  (to_v): Linear(
                    in_features=1280, out_features=1280, bias=False
                    (lora_dropout): ModuleDict(
                      (default): Identity()
                    )
                    (lora_A): ModuleDict(
                      (default): Linear(in_features=1280, out_features=16, bias=False)
                    )
                    (lora_B): ModuleDict(
                      (default): Linear(in_features=16, out_features=1280, bias=False)
                    )
                    (lora_embedding_A): ParameterDict()
                    (lora_embedding_B): ParameterDict()
                  )
                  (to_out): ModuleList(
                    (0): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
                    (1): Dropout(p=0.0, inplace=False)
                  )
                )
                (norm2): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                (attn2): Attention(
                  (to_q): Linear(
                    in_features=1280, out_features=1280, bias=False
                    (lora_dropout): ModuleDict(
                      (default): Identity()
                    )
                    (lora_A): ModuleDict(
                      (default): Linear(in_features=1280, out_features=16, bias=False)
                    )
                    (lora_B): ModuleDict(
                      (default): Linear(in_features=16, out_features=1280, bias=False)
                    )
                    (lora_embedding_A): ParameterDict()
                    (lora_embedding_B): ParameterDict()
                  )
                  (to_k): LoRACompatibleLinear(in_features=768, out_features=1280, bias=False)
                  (to_v): Linear(
                    in_features=768, out_features=1280, bias=False
                    (lora_dropout): ModuleDict(
                      (default): Identity()
                    )
                    (lora_A): ModuleDict(
                      (default): Linear(in_features=768, out_features=16, bias=False)
                    )
                    (lora_B): ModuleDict(
                      (default): Linear(in_features=16, out_features=1280, bias=False)
                    )
                    (lora_embedding_A): ParameterDict()
                    (lora_embedding_B): ParameterDict()
                  )
                  (to_out): ModuleList(
                    (0): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
                    (1): Dropout(p=0.0, inplace=False)
                  )
                )
                (norm3): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
                (ff): FeedForward(
                  (net): ModuleList(
                    (0): GEGLU(
                      (proj): LoRACompatibleLinear(in_features=1280, out_features=10240, bias=True)
                    )
                    (1): Dropout(p=0.0, inplace=False)
                    (2): LoRACompatibleLinear(in_features=5120, out_features=1280, bias=True)
                  )
                )
              )
            )
            (proj_out): LoRACompatibleConv(1280, 1280, kernel_size=(1, 1), stride=(1, 1))
          )
        )
        (resnets): ModuleList(
          (0-1): 2 x ResnetBlock2D(
            (norm1): GroupNorm(32, 1280, eps=1e-05, affine=True)
            (conv1): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
            (time_emb_proj): LoRACompatibleLinear(in_features=1280, out_features=1280, bias=True)
            (norm2): GroupNorm(32, 1280, eps=1e-05, affine=True)
            (dropout): Dropout(p=0.0, inplace=False)
            (conv2): LoRACompatibleConv(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
            (nonlinearity): SiLU()
          )
        )
      )
      (conv_norm_out): GroupNorm(32, 320, eps=1e-05, affine=True)
      (conv_act): SiLU()
      (conv_out): Conv2d(320, 4, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
  )
)
trainable params: 589,824 || all params: 123,650,304 || trainable%: 0.4770097451600281
PeftModel(
  (base_model): LoraModel(
    (model): CLIPTextModel(
      (text_model): CLIPTextTransformer(
        (embeddings): CLIPTextEmbeddings(
          (token_embedding): Embedding(49408, 768)
          (position_embedding): Embedding(77, 768)
        )
        (encoder): CLIPEncoder(
          (layers): ModuleList(
            (0-11): 12 x CLIPEncoderLayer(
              (self_attn): CLIPAttention(
                (k_proj): Linear(in_features=768, out_features=768, bias=True)
                (v_proj): Linear(
                  in_features=768, out_features=768, bias=True
                  (lora_dropout): ModuleDict(
                    (default): Identity()
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=768, out_features=16, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=16, out_features=768, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                )
                (q_proj): Linear(
                  in_features=768, out_features=768, bias=True
                  (lora_dropout): ModuleDict(
                    (default): Identity()
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=768, out_features=16, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=16, out_features=768, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                )
                (out_proj): Linear(in_features=768, out_features=768, bias=True)
              )
              (layer_norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (mlp): CLIPMLP(
                (activation_fn): QuickGELUActivation()
                (fc1): Linear(in_features=768, out_features=3072, bias=True)
                (fc2): Linear(in_features=3072, out_features=768, bias=True)
              )
              (layer_norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            )
          )
        )
        (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      )
    )
  )
)
08/30/2023 04:10:35 - INFO - __main__ - ***** Running training *****
08/30/2023 04:10:35 - INFO - __main__ -   Num examples = 200
08/30/2023 04:10:35 - INFO - __main__ -   Num batches each epoch = 200
08/30/2023 04:10:35 - INFO - __main__ -   Num Epochs = 3
08/30/2023 04:10:35 - INFO - __main__ -   Instantaneous batch size per device = 1
08/30/2023 04:10:35 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 1
08/30/2023 04:10:35 - INFO - __main__ -   Gradient Accumulation steps = 1
08/30/2023 04:10:35 - INFO - __main__ -   Total optimization steps = 500
Steps:   0%|                                                                                                                                                                        | 0/500 [00:00<?, ?it/s]/opt/conda/lib/python3.10/site-packages/torch/cuda/memory.py:303: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
  warnings.warn(
Traceback (most recent call last):
  File "/home/ubuntu/peft/examples/lora_dreambooth/train_dreambooth.py", line 1086, in <module>
    main(args)
  File "/home/ubuntu/peft/examples/lora_dreambooth/train_dreambooth.py", line 928, in main
    model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/utils/operations.py", line 632, in forward
    return model_forward(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/utils/operations.py", line 620, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/opt/conda/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/peft/peft_model.py", line 453, in forward
    return self.get_base_model()(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 930, in forward
    sample, res_samples = downsample_block(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 1043, in forward
    hidden_states = attn(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/transformer_2d.py", line 299, in forward
    hidden_states = torch.utils.checkpoint.checkpoint(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 251, in checkpoint
    return _checkpoint_without_reentrant(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 432, in _checkpoint_without_reentrant
    output = function(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/attention.py", line 194, in forward
    attn_output = self.attn1(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 419, in forward
    return self.processor(
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1018, in __call__
    query = attn.to_q(hidden_states, lora_scale=scale)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: Linear.forward() got an unexpected keyword argument 'lora_scale'
Steps:   0%|                                                                                                                                                                        | 0/500 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/opt/conda/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/launch.py", line 986, in launch_command
    simple_launcher(args)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3.10', 'train_dreambooth.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--instance_data_dir=instance-images/messi/', '--class_data_dir=class-images/messi/', '--output_dir=output/messi', '--train_text_encoder', '--with_prior_preservation', '--prior_loss_weight=1.0', '--instance_prompt=a photo of sks man', '--class_prompt=a photo of man', '--resolution=512', '--train_batch_size=1', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200', '--use_lora', '--lora_r', '16', '--lora_alpha', '27', '--lora_text_encoder_r', '16', '--lora_text_encoder_alpha', '17', '--learning_rate=1e-4', '--gradient_accumulation_steps=1', '--gradient_checkpointing', '--max_train_steps=500']' returned non-zero exit status 1.
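
For context, the traceback shows diffusers' attention processor calling attn.to_q(hidden_states, lora_scale=scale), while PEFT has replaced to_q with a wrapped Linear whose forward() does not accept that keyword. A minimal sketch of the mismatch (stand-in class, not the actual diffusers/peft code):

import torch
import torch.nn as nn

# Stand-in for a PEFT-wrapped projection: forward() takes no extra keyword arguments.
class WrappedLinear(nn.Linear):
    def forward(self, x):
        return super().forward(x)

proj = WrappedLinear(320, 320, bias=False)
x = torch.randn(2, 320)

proj(x)                  # works
proj(x, lora_scale=1.0)  # TypeError: forward() got an unexpected keyword argument 'lora_scale'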

@BenjaminBossan
Member

Could you please check whether going back to diffusers v0.20 solves the problem? If so, a recent change in diffusers may have introduced the issue.
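
One way to pin the earlier release for this check (assuming 0.20.0 is the intended release):

pip install "diffusers==0.20.0"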

@ghost
Author

ghost commented Aug 31, 2023

Ah, that did the trick. It works with v0.20. Thanks! Closing the issue.

This issue was closed.