Describe the bug
When I try to run the script "train_dreambooth_lora_sd3_miniature.py" with the argument `--resume_from_checkpoint`, it fails with the following error:
```
Traceback (most recent call last):
  File "/kaggle/working/./diffusers/examples/research_projects/sd3_lora_colab/train_dreambooth_lora_sd3_miniature.py", line 1150, in <module>
    main(args)
  File "/kaggle/working/./diffusers/examples/research_projects/sd3_lora_colab/train_dreambooth_lora_sd3_miniature.py", line 934, in main
    accelerator.load_state(os.path.join(args.output_dir, path))
  File "/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py", line 3147, in load_state
    self.step = override_attributes["step"]
KeyError: 'step'
```
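For context, the crash can be reproduced in isolation: `load_state` assumes the attribute dict restored from the checkpoint always contains a `"step"` key, which a checkpoint written by an older Accelerate release may not provide. The snippet below is a minimal sketch of that failing lookup with a defensive fallback; the dict contents and the fallback are assumptions for illustration, not the library's actual code:

```python
# Sketch of the failing lookup in accelerate's load_state (line 3147).
# override_attributes is rebuilt from the checkpoint; a checkpoint saved
# by an older Accelerate release may not include a "step" entry.
override_attributes = {"iteration": 900}  # hypothetical checkpoint contents

try:
    step = override_attributes["step"]  # what Accelerate 0.32.1 does -> KeyError
except KeyError:
    # Hedged workaround: fall back to 0, or recover the step from the
    # checkpoint directory name (e.g. "checkpoint-900") instead.
    step = 0

print(step)
```

This suggests a version mismatch between the Accelerate release that saved the checkpoint and the one loading it, which is worth checking before patching anything.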
System Info
🤗 Diffusers version: 0.30.0.dev0
Platform: Linux-5.15.154+-x86_64-with-glibc2.31
Running on a notebook?: Yes
Running on Google Colab?: No
Python version: 3.10.13
PyTorch version (GPU?): 2.1.2 (True)
Flax version (CPU?/GPU?/TPU?): 0.8.4 (gpu)
Jax version: 0.4.26
JaxLib version: 0.4.26.dev20240504
Huggingface_hub version: 0.23.2
Transformers version: 4.42.3
Accelerate version: 0.32.1
PEFT version: 0.11.1
Bitsandbytes version: 0.43.1
Safetensors version: 0.4.3
xFormers version: not installed
Accelerator: 2× Tesla T4, 15360 MiB VRAM each
Using GPU in script?: Yes, 2 GPUs
Using distributed or parallel set-up in script?:
Who can help?
No response
Reproduction
```shell
!accelerate launch ./diffusers/examples/research_projects/sd3_lora_colab/train_dreambooth_lora_sd3_miniature.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-3-medium-diffusers" \
  --instance_data_dir="dataset" \
  --data_df_path="./metadata/parquet/sample_embeddings.parquet" \
  --output_dir="Output" \
  --mixed_precision="fp16" \
  --instance_prompt="the_instance_prompt" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 --gradient_checkpointing \
  --checkpointing_steps=150 \
  --max_train_steps=1200 \
  --use_8bit_adam \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --resume_from_checkpoint='checkpoint-900' \
  --seed="2" \
  --rank=64 \
  --report_to='wandb'
```
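Since the script resumes from `checkpoint-900`, the global step is recoverable from the checkpoint directory name itself, which is one way to work around the missing `"step"` entry. A hedged helper sketch (the function name and error handling are assumptions, not part of the training script):

```python
import os
import re

def step_from_checkpoint(path: str) -> int:
    """Extract the global step from a 'checkpoint-<n>' directory name."""
    name = os.path.basename(os.path.normpath(path))
    m = re.fullmatch(r"checkpoint-(\d+)", name)
    if m is None:
        raise ValueError(f"not a checkpoint directory: {path!r}")
    return int(m.group(1))

print(step_from_checkpoint("Output/checkpoint-900"))  # -> 900
```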
Logs