Training process fails with a Jax library related issue #255
Comments
Any update? I'm facing the same issue.
this seems to work:
|
Google always ends up breaking something in each update, so you need to pin a specific version: This solves the problem for now.
Indeed
Probably the PyTorch version should be pinned too, but to which version?
I have the same problem. Did anyone find a solution?
@roman19932024 try updating your Python version to 3.10.
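For anyone trying the Python 3.10 suggestion, a minimal sketch for confirming which interpreter the notebook is actually running. The helper name `python_at_least` is hypothetical, and whether 3.10 resolves the error is only what the comment above reports, not something this check verifies.

```python
import sys


def python_at_least(major, minor, version_info=None):
    """Return True if the interpreter version is at least major.minor.

    version_info defaults to the running interpreter; passing a tuple
    makes the check easy to try out with other versions.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (major, minor)


if __name__ == "__main__":
    running = sys.version.split()[0]
    if python_at_least(3, 10):
        print("Python version is 3.10 or newer:", running)
    else:
        print("Python is older than 3.10:", running)
```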
Describe the bug
Training process fails with a Jax library related issue.
This is the code in the notebook cell that fails:
!python3 train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --seed=1337 \
  --resolution=512 \
  --train_batch_size=1 \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=50 \
  --sample_batch_size=4 \
  --max_train_steps=800 \
  --save_interval=10000 \
  --save_sample_prompt="photo of narrow gate" \
  --concepts_list="concepts_list.json"
Attached is the screenshot for the error:
Reproduction
Run the training process by issuing the following command:
!python3 train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --seed=1337 \
  --resolution=512 \
  --train_batch_size=1 \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=50 \
  --sample_batch_size=4 \
  --max_train_steps=800 \
  --save_interval=10000 \
  --save_sample_prompt="photo of narrow gate" \
  --concepts_list="concepts_list.json"
Logs
No response
System Info
I am running this in a Google Colab runtime with Python 3, on a Google Compute Engine backend with a Tesla GPU.
Install details:
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/scripts/convert_diffusers_to_original_stable_diffusion.py
%pip install -qq git+https://github.com/ShivamShrirao/diffusers
%pip install -q -U --pre triton
%pip install -q accelerate transformers ftfy bitsandbytes==0.35.0 gradio natsort safetensors xformers
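Since the install commands above leave most packages unpinned, the versions actually resolved can differ from one Colab session to the next. A minimal sketch for listing what ended up installed, so it can be pasted into a report like this one: the helper name `installed_versions` is hypothetical, and the package list is an assumption drawn from the install lines above plus the `jax`, `jaxlib`, and `torch` packages mentioned in the thread.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version string.

    Packages that are not installed map to the string 'not installed'.
    """
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    # Assumed list: packages from the install commands, plus jax, jaxlib
    # and torch, which the thread discusses but the commands do not pin.
    packages = [
        "jax", "jaxlib", "torch", "diffusers", "transformers",
        "accelerate", "bitsandbytes", "xformers", "triton",
    ]
    print("python", sys.version.split()[0])
    for name, ver in installed_versions(packages).items():
        print(name, ver)
```

Run in a notebook cell after the install step, this prints one line per package, which makes it easier to compare a failing environment with a working one.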