Commit 2b30b10

Create train_dreambooth_inpaint.py (open-mmlab#1091)
* Create train_dreambooth_inpaint.py: train_dreambooth.py adapted to work with the inpaint model, generating random masks during training
* Refactor train_dreambooth_inpaint.py with black
* Fix prior preservation
* Add instructions to readme, fix SD2 compatibility
1 parent 3ad49ee commit 2b30b10

File tree: 2 files changed, +866 −0 lines
examples/dreambooth/README.md (+118 lines)
  --num_class_images=200 \
  --max_train_steps=800
```

## Dreambooth for the inpainting model

The `train_dreambooth_inpaint.py` script adapts `train_dreambooth.py` to work with the Stable Diffusion inpainting model, generating random masks during training.

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400
```

The script is also compatible with prior-preservation loss and gradient checkpointing.

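As the commit message notes, the script trains on randomly generated masks rather than fixed ones. As a rough illustration (not the script's exact implementation), a random rectangular inpainting mask could be produced like this, where `random_rect_mask` and its parameters are hypothetical names:

```python
import random
import numpy as np

def random_rect_mask(height, width, min_frac=0.25, max_frac=0.75, seed=None):
    """Return a float mask with a random rectangle set to 1 (the region to inpaint)."""
    rng = random.Random(seed)
    # Random rectangle size, between min_frac and max_frac of each dimension
    mh = rng.randint(int(height * min_frac), int(height * max_frac))
    mw = rng.randint(int(width * min_frac), int(width * max_frac))
    # Random top-left position so the rectangle fits inside the image
    top = rng.randint(0, height - mh)
    left = rng.randint(0, width - mw)
    mask = np.zeros((height, width), dtype=np.float32)
    mask[top:top + mh, left:left + mw] = 1.0
    return mask

mask = random_rect_mask(512, 512, seed=0)
```

During training a fresh mask would be drawn per example, so the model learns to reconstruct arbitrary masked regions.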
### Training with prior-preservation loss

Prior preservation is used to avoid overfitting and language drift. Refer to the paper to learn more about it. For prior preservation, we first generate class images using the model with a class prompt, and then use those images alongside our own data during training.
According to the paper, it's recommended to generate `num_epochs * num_samples` images for prior preservation; 200-300 images work well for most cases.
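Conceptually, each training step combines the loss on instance images with a weighted loss on the generated class images. A minimal numpy sketch of that combination, assuming (for illustration only) a batch laid out as instance examples followed by class examples along the first axis:

```python
import numpy as np

def dreambooth_loss(model_pred, target, prior_loss_weight=1.0):
    # Assumed layout: first half of the batch is instance examples,
    # second half is class (prior) examples.
    pred_inst, pred_prior = np.split(model_pred, 2, axis=0)
    tgt_inst, tgt_prior = np.split(target, 2, axis=0)
    instance_loss = np.mean((pred_inst - tgt_inst) ** 2)   # MSE on instance images
    prior_loss = np.mean((pred_prior - tgt_prior) ** 2)    # MSE on class images
    return instance_loss + prior_loss_weight * prior_loss  # --prior_loss_weight
```

The `--prior_loss_weight` flag controls how strongly the class images pull the model back toward its prior.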
```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

### Training with gradient checkpointing and 8-bit optimizer

With the help of gradient checkpointing and the 8-bit optimizer from bitsandbytes, it's possible to train DreamBooth on a 16GB GPU.

To install `bitsandbytes`, please refer to this [readme](https://github.com/TimDettmers/bitsandbytes#requirements--installation).

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
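Note that `--gradient_accumulation_steps=2` trades memory for wall-clock time: gradients from several micro-batches are accumulated before each optimizer update, so the effective batch size is `train_batch_size * gradient_accumulation_steps`. A minimal numpy sketch of why accumulating scaled micro-batch gradients matches one large batch (using a stand-in gradient function, not the training loop itself):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(8, 4))              # 8 examples, 4 features
grad_fn = lambda batch: batch.mean(axis=0)  # stand-in for a per-batch gradient

# One update computed on the full batch of 8 ...
full_grad = grad_fn(data)

# ... versus 4 micro-batches of 2 with gradient accumulation:
accum = np.zeros(4)
for micro in np.split(data, 4):
    accum += grad_fn(micro) / 4  # scale each micro-batch so the sum matches

assert np.allclose(accum, full_grad)
```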

### Fine-tune the text encoder with the UNet

The script also allows you to fine-tune the `text_encoder` along with the `unet`. It has been observed experimentally that fine-tuning the `text_encoder` gives much better results, especially on faces.
Pass the `--train_text_encoder` argument to the script to enable training the `text_encoder`.

___Note: Training the text encoder requires more memory; with this option the training won't fit on a 16GB GPU. It needs at least 24GB of VRAM.___

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --use_8bit_adam \
  --gradient_checkpointing \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
