[Examples] fix checkpointing and casting bugs in train_text_to_image_lora_sdxl.py #4632

Merged
merged 5 commits into main from fix/sdxl-lora-training-script on Aug 23, 2023

Conversation

sayakpaul
Member

Fixes #4566

Fixes #4619

The checkpointing bug is also evident in #4566. Once this PR is merged, I will reflect these changes in the LoRA DreamBooth SDXL script too.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Aug 16, 2023

The documentation is not available anymore as the PR was closed or merged.

@xiankgx

xiankgx commented Aug 16, 2023

@sayakpaul

  1. This command-line argument is unused.

    parser.add_argument(
        "--prior_generation_precision",
        type=str,
        default=None,
        choices=["no", "fp32", "fp16", "bf16"],
        help=(
            "Choose prior generation precision between fp32, fp16 and bf16 (bfloat16). Bf16 requires PyTorch >="
            " 1.10.and an Nvidia Ampere GPU. Default to fp16 if a GPU is available else fp32."
        ),
    )

  2. Text encoders one and two are prepared with Accelerate only conditionally, but are then unwrapped with accelerator.unwrap_model regardless of whether they were prepared. Not sure if there is any issue with this.

    # Prepare everything with our `accelerator`.
    if args.train_text_encoder:
        unet, text_encoder_one, text_encoder_two, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
            unet, text_encoder_one, text_encoder_two, optimizer, train_dataloader, lr_scheduler
        )
    else:
        unet, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
            unet, optimizer, train_dataloader, lr_scheduler
        )

    text_encoder=accelerator.unwrap_model(text_encoder_one),
    text_encoder_2=accelerator.unwrap_model(text_encoder_two),

  3. We accumulate the gradients with accelerator.accumulate(unet). If we enable training the text encoder, is this still valid?

    with accelerator.accumulate(unet):

  4. This recreation of the text encoders seems unnecessary.

    # create pipeline
    if not args.train_text_encoder:
        text_encoder_one = text_encoder_cls_one.from_pretrained(
            args.pretrained_model_name_or_path, subfolder="text_encoder", revision=args.revision
        )
        text_encoder_two = text_encoder_cls_two.from_pretrained(
            args.pretrained_model_name_or_path, subfolder="text_encoder_2", revision=args.revision
        )

  5. If we flip an image, do we take the coordinates of the flipped top-left (now the top-right), or do we take the original top-left (previously the top-right)?
    https://github.com/huggingface/diffusers/blob/dc898919e1c10ee1c8348261860dcd41d56e040b/examples/text_to_image/train_text_to_image_lora_sdxl.py#L872C2-L873

@sayakpaul
Member Author

sayakpaul commented Aug 17, 2023

Text encoders one and two are prepared with Accelerate only conditionally, but are then unwrapped with accelerator.unwrap_model regardless of whether they were prepared. Not sure if there is any issue with this.

Unwrapping can be a no-op in cases where it's not supposed to take effect, so it shouldn't be a problem.
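As a minimal illustration (a hedged sketch, not part of the script), unwrap_model simply hands back a module that was never prepared:

    import torch
    from accelerate import Accelerator

    accelerator = Accelerator()
    model = torch.nn.Linear(4, 4)
    # The text encoders are only prepared when they are trained; unwrapping an
    # un-prepared module returns the same object, so the call is harmless.
    assert accelerator.unwrap_model(model) is model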

This recreation of the text encoders seems unnecessary.

Yes, you're totally right, since we're precomputing the text embeddings here. Will reflect this in the PR.
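Roughly, the validation pipeline could reuse the encoders that are already in memory instead of reloading them from disk. A hedged sketch (assuming `vae` and `weight_dtype` are in scope as elsewhere in the script):

    from diffusers import StableDiffusionXLPipeline

    # Sketch only: build the validation pipeline from the models already loaded,
    # rather than calling from_pretrained on the text encoders a second time.
    pipeline = StableDiffusionXLPipeline.from_pretrained(
        args.pretrained_model_name_or_path,
        vae=vae,
        text_encoder=accelerator.unwrap_model(text_encoder_one),
        text_encoder_2=accelerator.unwrap_model(text_encoder_two),
        unet=accelerator.unwrap_model(unet),
        revision=args.revision,
        torch_dtype=weight_dtype,
    )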

We accumulate the gradients with accelerator.accumulate(unet). If we enable training the text encoder, is this still valid?

Ccing @muellerzr here.
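For reference, one possible way to handle this is to pass every trained model to accumulate, assuming a recent Accelerate version whose accumulate accepts several models (a sketch, not the script's current code; compute_loss is a hypothetical stand-in for the training step):

    # Collect every model that receives gradients so gradient accumulation and
    # gradient synchronization cover all of them, not just the UNet.
    models_to_accumulate = [unet]
    if args.train_text_encoder:
        models_to_accumulate += [text_encoder_one, text_encoder_two]

    for step, batch in enumerate(train_dataloader):
        with accelerator.accumulate(*models_to_accumulate):
            loss = compute_loss(batch)  # hypothetical helper for the forward pass + loss
            accelerator.backward(loss)
            optimizer.step()
            lr_scheduler.step()
            optimizer.zero_grad()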

This command-line argument is unused.

Will remove. Thanks for flagging!

If we flip an image, do we take the coordinates of the flipped top-left (now the top-right), or do we take the original top-left (previously the top-right)?

I think the current implementation accounts for both if I am reading correctly. If you think otherwise, maybe provide a few visual examples? Also ccing @okotaku here.

Thanks so much for providing feedback!

@xiankgx

xiankgx commented Aug 17, 2023


If we flip an image, do we take the coordinates of the flipped top-left (now the top-right), or do we take the original top-left (previously the top-right)?
[screenshot illustrating the flipped crop coordinates]
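To make the question concrete, here is a rough sketch of the two coordinate conventions (illustrative names only, not the script's code):

    import random
    from torchvision import transforms
    from torchvision.transforms import functional as TF

    def flip_then_crop(image, resolution):
        image = TF.hflip(image)  # assume the random flip fired
        y1, x1, h, w = transforms.RandomCrop.get_params(image, output_size=(resolution, resolution))
        cropped = TF.crop(image, y1, x1, h, w)
        x1_flipped_frame = x1                     # coordinates measured in the flipped image
        x1_original_frame = image.width - x1 - w  # the same region's left edge in the unflipped image
        return cropped, (y1, x1_flipped_frame), (y1, x1_original_frame)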

@sayakpaul
Member Author

I think we should consider the original state here; hence, x1 = image.width - x1.

@sayakpaul sayakpaul merged commit 4909b1e into main Aug 23, 2023
11 checks passed
@sayakpaul sayakpaul deleted the fix/sdxl-lora-training-script branch August 23, 2023 05:29
@stpic270

Hi, I still get this error using my own data and don't know how to fix it yet.
REPRODUCTION
! accelerate launch /content/diffusers/examples/text_to_image/train_text_to_image_lora_sdxl.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --pretrained_vae_model_name_or_path='madebyollin/sdxl-vae-fp16-fix' \
  --train_data_dir="/content/EN_DATA" \
  --caption_column="text" \
  --resolution=512 --random_flip \
  --gradient_accumulation_steps=11 \
  --train_batch_size=3 \
  --num_train_epochs=1 --checkpointing_steps=10 --checkpoints_total_limit=3 \
  --learning_rate=3e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --validation_prompt="Beutiful comet in a cartoon style" \
  --seed=42 \
  --report_to="wandb" \
  --output_dir='/content/output_dir_5_epochs' \
  --resume_from_checkpoint 'latest' \
  --use_8bit_adam

ERROR
[screenshot of the error traceback]

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
…_lora_sdxl.py` (huggingface#4632)

* fix: casting issues.

* fix checkpointing.

* tests

* fix: bugs