
Releases: bghira/SimpleTuner

v0.9.7.4 - the curious case of the missing images

21 Jun 20:31
c4cec13

What's Changed

tl;dr Some square-cropped dataset shenanigans are resolved; more images should be captured for large datasets, and fewer will be excluded as too_small.

  • Added CUDA GC every x steps by @TwoAbove in #489
  • automated memory cleanup fix by @bghira in #496
  • parquet backend: mild bugfixes for cropping by @bghira in #506
  • toolkit: florence captioning script
  • bugfix: reference to args.crop which was removed several major versions ago
  • sdxl trainer: add tracking of grad_norm to wandb / tensorboard, to align with sd2.x/deepfloyd trainer
  • bugfix: account for dataset repeats when counting batches in the dataset by @bghira in #512
  • inference: fix default script accelerator device check for MPS (@TwoAbove) in #511
  • repeats: document that kohya's implementation is incorrect in the dataloader guide and kohya config converter by @bghira in #513
  • bugfix: capture more samples when their intermediary size is smaller than target. by @bghira in #515

Full Changelog: v0.9.7.3...v0.9.7.4

v0.9.7.3 - follow-up fixes & improvements

19 Jun 01:08
5dc259e

If you're using cropped aspect buckets, you may need to recalculate them and remove the cache objects.

What's Changed

  • pixart sigma tuning by @bghira in #471
  • multigpu fixes for pixart
  • training batch size bugfix for pixart
  • repeated arg fix for pixart
  • remove debug print by @bghira in #475
  • text encoder load and unload fixes by @bghira in #477
  • remove --disable_text_encoders option as it is unused by @bghira in #478
  • Add SD3 model extension code by @AmericanPresidentJimmyCarter in #469
  • sd3: model expansion code from jimmycarter
  • pixart sigma support for training full model (no LoRA)
  • bitfit: disable by default
  • aspect bucketing fix for stretchy/improperly cropped images when using crop=true
  • misc bugfixes / QoL improvements by @bghira in #490

Full Changelog: v0.9.7.2...v0.9.7.3

v0.9.7.2 - pixart sigma

17 Jun 04:09
c52d7d8

PixArt Sigma training is now available. See the quickstart guide for information.

What's Changed

  • Fix bash script permissions/env declaration by @Beinsezii in #456
  • sd3: default weighting mechanism should be logit_normal
  • sd3 diffusion: disable rectified flow for classic DDPM training
  • text backend configuration error clarification
  • fix for shebang on unix script
  • sdxl: validations error when pipeline is None, logging some information for debugging the problem while ensuring no crash occurs by @bghira in #459
  • Updated documentation formatting as well as added on to the QUICKSTART.md by @TwoAbove in #466
  • sd3 QoL improvements
  • better error messages
  • documentation improvements, quickstart guide
  • vae cache: prevent corruption on multigpu systems
  • apple: mixed precision auto override for convenience
  • sd3: remove text encoder training from LoRA, as it doesn't work with MM-DiT by @bghira in #468

Full Changelog: v0.9.7.1...v0.9.7.2

v0.9.7.1

15 Jun 02:34
8e90b11

[image: A nice robot to help you forget that I messed up some things]

Breaking changes

  • The aspect bucket interval for SD3 is 64px, not 8px.
  • The intermediary size for uncropped datasets was wrong for square images.
  • --input_perturbation is gone, as it harmed models

The first two of these issues will require removing:

  • The vae cache contents (.pt files)
  • The .json files from your data dir (aspect*.json)

You do not have to re-cache the text embeds.
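
As a concrete sketch, the cleanup can be done from a shell; the paths below are placeholders for your own VAE cache and data directories, not values taken from any config:

```bash
# Delete the stale VAE cache objects (.pt files).
# /path/to/vae_cache is a placeholder for your configured cache directory.
find /path/to/vae_cache -name '*.pt' -delete

# Delete the aspect bucket metadata (aspect*.json) from your data directory.
rm /path/to/data_dir/aspect*.json
```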

Note: SD3 support is still experimental, and it's recommended to avoid investing any large budget into training runs until a hyperparameter sweep can be conducted and ablation studies are done.

What's Changed

  • sdxl regression fixes by @bghira in #444
  • sd3: clear T5 encoder out to save memory
  • model card bugfixes by @bghira in #445
  • sd3 timestep selection weighting changes by @bghira in #446
  • remove --input_perturbation code
  • fix tensor location error by @bghira in #448
  • sd3: make sure we don't use the custom sdxl vae by default (whoops)
  • nvidia: update dependencies by @bghira in #449
  • sd3 vae default fix by @bghira in #450
  • Unload LoRA weights before deleting pipeline object by @tolgacangoz in #451
  • better error message for text embed failure
  • fix for pipeline unload deleting pipeline before unloading the weights
  • sd3: use 64px interval for bucketing instead of 8px by @bghira in #454

Full Changelog: v0.9.7...v0.9.7.1

v0.9.7 - stable diffusion 3

13 Jun 04:56
b1d8938

Stable Diffusion 3

To use, set STABLE_DIFFUSION_3=true in your sdxl-env.sh and set your base model to stabilityai/stable-diffusion-3-medium-diffusers.
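
For example, the relevant lines in sdxl-env.sh might look like the following; STABLE_DIFFUSION_3 is the flag described above, while MODEL_NAME is assumed here to be the variable holding your base model:

```bash
# Enable the Stable Diffusion 3 code path.
export STABLE_DIFFUSION_3=true

# Base model to train from (variable name assumed from the usual env layout).
export MODEL_NAME="stabilityai/stable-diffusion-3-medium-diffusers"
```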

What's Changed

  • speed-up for training sample prep by @bghira in #434
  • validation on multiple gpu needs unwrap on unet by @bghira in #435
  • validations: unwrap unet when validating, in case we're using DDP (mgpu)
  • model card: estimate the number of files correctly for multigpu training
  • model card: add an auto-generated code example
  • randomised aspect bucket fix/improvement to bucket list rebuild logic
  • bugfix for rare case when removing dataset from config mid-training causes infinite loop upon resume by @bghira in #437
  • minor bugfixes for skipping unrelated errors during training by @bghira in #438
  • stable diffusion 3 support by @bghira in #436
  • sd3 lora training support by @bghira in #439
  • stable diffusion 3 & validation fixes for SDXL/others by @bghira in #440

Full Changelog: v0.9.6.3...v0.9.7

v0.9.6.3 multigpu training fixes and optimisations

09 Jun 22:39
2b6b765

What's Changed

MultiGPU training improvements

Thanks to Fal.ai for providing hardware to investigate and improve these areas:

  • VAE caching now reliably runs across all GPUs without missing any entries
  • Text embed caching is now parallelised safely across all GPUs
  • Other speed improvements, torch compile tests, and more

Pull requests

  • mixture-of-experts updates by @bghira in #430
  • vaecache: optimise preprocessing performance, getting more GPU utilisation
  • diffusers: update version. add workaround for sharded checkpoints, resolved in next version of the library.
  • parquet: optimise loading of captions by skipping filter
  • trainingsample: fix reference to nonexistent metadata during vae preprocessing by @bghira in #431
  • text embed cache: optimise to run across GPUs by @bghira in #432
  • vaecache: fix multigpu caching missing a huge number of relevant jobs by @bghira in #433

MultiGPU fixes made possible with hardware provided by Fal.ai

Full Changelog: v0.9.6.2...v0.9.6.3

v0.9.6.2 mixture-of-experts training

06 Jun 00:53
0e53ca5

What's Changed

Mixture-of-Experts

Mixture-of-Experts training is now available, complete with a brief tutorial on how to accelerate your training and start producing mind-blowing results.

  • DeepSpeed fix (#424)
  • Parquet backend fixes for different dataset sources
  • Parquet backend JSON / JSONL support
  • Updated check for aspect ratio mismatch to be more reliable by @bghira in #427
  • minor bugfixes for sd2.x/controlnet/sdxl refiner training by @bghira in #428
  • mixture-of-experts training via segmind models by @bghira in #429

Full Changelog: v0.9.6.1...v0.9.6.2

v0.9.6.1

30 May 21:52
11ab703

What's Changed

  • remove info log line by @bghira in #418
  • blip3: resume captioning an input file, and only caption files that have not yet been captioned
  • parquet backend: resolve retrieval of width/height from series columns
  • documentation: improve phrasing for more inclusivity by @bghira in #419
  • toolkit: new captioners, new captioning features for blip3
  • parquet backend: better debug logging
  • honor DELETE_PROBLEMATIC_IMAGES in the VAE cache backend when a read fails by @bghira in #420
  • multigpu fixes
  • cuda: update nvidia libs to cuda 12.1 / torch 2.3
  • validations: noise scheduler wasn't being configured by @bghira in #422
  • randomised bucketing should correct the intermediary size in a special way to ease the pain of implementation by @bghira in #423
  • debiased bucket training should rebuild cache upon epoch end (implements #416) by @bghira in #424
  • Fix retrieval of parquet captions when not using AWS backend by @bghira in #425
  • parquet backend improvements and rebuilding buckets/vae cache on each epoch for randomised bucketing by @bghira in #426

Full Changelog: v0.9.6...v0.9.6.1

v0.9.6 - debias them buckets

23 May 17:08
67dc2a8

debiased aspect bucketing

When training on large datasets of heterogeneous samples, you will discover a content bias among aspect ratios: vertical images contain portraits, widescreen shots are cinematic, and square images tend to be more artistic.

A new feature, crop_aspect=random, is introduced in an attempt to combat this issue. A known issue in the implementation (#416) limits its usefulness for small datasets, but in its current state it is capable of de-biasing very large datasets.
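
As a sketch, a dataset entry in the dataloader configuration might enable the option like this; crop_aspect is the new setting, and every other key is illustrative:

```json
{
  "id": "my-large-dataset",
  "type": "local",
  "instance_data_dir": "/path/to/images",
  "crop": true,
  "crop_aspect": "random"
}
```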

What's Changed

  • prompt library: rewrite all prompts, focusing on concept diversity and density, to reduce 'sameness' complaints about the prompt library
  • logging: reduce logspam in INFO log level
  • aspect bucketing: ability to randomise aspect buckets without distorting the images (experimental)
  • validations: ability to disable uncond generation for a slight speed-up on slow hardware when not necessary
  • aspect bucketing: ability to customise the aspect resolution mappings and enforce the resolutions you wish to train on
  • captioning toolkit: new scripts for gemini-pro-vision, paligemma 3B and BLIP3
  • bugfix: dataloader metadata retrieval would occasionally return the wrong values if filenames match across multiple datasets

A majority of the changes were merged via #417

Full Changelog: v0.9.5.4...v0.9.6

v0.9.5.4 - controlnet training

17 May 19:28
d8ba270

What's Changed

Experimental ControlNet training support.

  • invalidate bad caches when they fail to load by @bghira in #406
  • controlnet training support (sdxl+sd2x) by @bghira in #407
  • huggingface hub: skip errors when uploading model for SD 2.x and SDXL trainers by @bghira in #410

Full Changelog: v0.9.5.3c...v0.9.5.4