Releases: bghira/SimpleTuner
v0.9.7.4 - the curious case of the missing images
What's Changed
tl;dr Some square-cropped dataset shenanigans are solved; more images should be captured for large datasets, and fewer will be excluded as `too_small`.
- Added CUDA GC every x steps by @TwoAbove in #489
- automated memory cleanup fix by @bghira in #496
- parquet backend: mild bugfixes for cropping by @bghira in #506
- toolkit: florence captioning script
- bugfix: reference to args.crop which was removed several major versions ago
- sdxl trainer: add tracking of grad_norm to wandb / tensorboard, to align with sd2.x/deepfloyd trainer
- bugfix: account for dataset repeats when counting batches in the dataset by @bghira in #512
- inference: fix default script accelerator device check for MPS by @TwoAbove in #511
- repeats: document that kohya's implementation is incorrect in the dataloader guide and kohya config converter by @bghira in #513 (see the sketch after this list)
- bugfix: capture more samples when their intermediary size is smaller than target by @bghira in #515
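A rough sketch of why repeats change the batch count (the semantics here — `repeats=N` meaning each sample is seen N+1 times, versus kohya treating `num_repeats=1` as a single pass — are an assumption drawn from the dataloader guide, not a definitive reference):
```bash
# 100 images with repeats=3 yield 400 effective samples; at batch size 4,
# that's 100 batches per epoch. Integer division shown; the trainer's exact
# rounding for partial batches may differ.
images=100; repeats=3; batch_size=4
echo $(( images * (repeats + 1) / batch_size ))   # prints 100
```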
Full Changelog: v0.9.7.3...v0.9.7.4
v0.9.7.3 - follow-up fixes & improvements
If you're using cropped aspect buckets, you may need to recalculate them and remove the cache objects.
What's Changed
- pixart sigma tuning by @bghira in #471
- multigpu fixes for pixart
- training batch size bugfix for pixart
- repeated arg fix for pixart
- remove debug print by @bghira in #475
- text encoder load and unload fixes by @bghira in #477
- remove --disable_text_encoders option as it is unused by @bghira in #478
- Add SD3 model extension code by @AmericanPresidentJimmyCarter in #469
- sd3: model expansion code from jimmycarter
- pixart sigma support for training full model (no LoRA)
- bitfit: disable by default
- aspect bucketing fix for stretchy/improperly cropped images when using `crop=true`
- misc bugfixes / QoL improvements by @bghira in #490
New Contributors
- @AmericanPresidentJimmyCarter made their first contribution in #469
Full Changelog: v0.9.7.2...v0.9.7.3
v0.9.7.2 - pixart sigma
PixArt Sigma training is now available. See the quickstart guide for information.
What's Changed
- Fix bash script permissions/env declaration by @Beinsezii in #456
- sd3: default weighting mechanism should be logit_normal
- sd3 diffusion: disable rectified flow for classic DDPM training
- text backend configuration error clarification
- fix for shebang on unix script
- sdxl: handle validation errors when the pipeline is None, logging information to debug the problem while ensuring no crash occurs by @bghira in #459
- Updated documentation formatting as well as added on to the QUICKSTART.md by @TwoAbove in #466
- sd3 QoL improvements
- better error messages
- documentation improvements, quickstart guide
- vae cache: prevent corruption on multigpu systems
- apple: mixed precision auto override for convenience
- sd3: remove text encoder training from LoRA, as it doesn't work with MM-DiT by @bghira in #468
Full Changelog: v0.9.7.1...v0.9.7.2
v0.9.7.1
A nice robot to help you forget that I messed up some things
Breaking changes
- The aspect bucket interval for SD3 is 64px, not 8px.
- The intermediary size for uncropped datasets was wrong for square images.
- `--input_perturbation` is gone, as it harmed models.
The first two issues will require removing:
- The VAE cache contents (`.pt` files)
- The `.json` files from your data dir (`aspect*.json`)
You do not have to re-cache the text embeds.
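As a minimal cleanup sketch (paths are assumptions — substitute your actual VAE cache and data directories):
```bash
# Delete cached VAE latents and aspect bucket metadata so both are rebuilt
# with the corrected sizes on the next run. Text embed caches can stay.
find /path/to/vae_cache -name '*.pt' -delete
find /path/to/data_dir -name 'aspect*.json' -delete
```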
Note: SD3 support is still experimental, and it's recommended to avoid investing any large budget into training runs until a hyperparameter sweep can be conducted and ablation studies are done.
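For context on the first breaking change: bucket resolutions now land on a 64px grid rather than an 8px one. Assuming round-down snapping (an assumption for illustration, not confirmed behaviour), the effect looks like:
```bash
# Snap a sample resolution to the 64px grid (round-down assumed).
w=1365; h=768
echo "$(( w / 64 * 64 ))x$(( h / 64 * 64 ))"   # prints 1344x768
```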
What's Changed
- sdxl regression fixes by @bghira in #444
- sd3: clear T5 encoder out to save memory
- model card bugfixes by @bghira in #445
- sd3 timestep selection weighting changes by @bghira in #446
- remove `--input_perturbation` code
- fix tensor location error by @bghira in #448
- sd3: make sure we don't use the custom sdxl vae by default (whoops)
- nvidia: update dependencies by @bghira in #449
- sd3 vae default fix by @bghira in #450
- Unload LoRA weights before deleting pipeline object by @tolgacangoz in #451
- better error message for text embed failure
- fix for pipeline unload deleting pipeline before unloading the weights
- sd3: use 64px interval for bucketing instead of 8px by @bghira in #454
New Contributors
- @tolgacangoz made their first contribution in #451
Full Changelog: v0.9.7...v0.9.7.1
v0.9.7 - stable diffusion 3
Stable Diffusion 3
To use, set `STABLE_DIFFUSION_3=true` in your `sdxl-env.sh` and set your base model to `stabilityai/stable-diffusion-3-medium-diffusers`.
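A minimal sketch of the relevant `sdxl-env.sh` lines (`MODEL_NAME` as the base-model variable is an assumption — verify the exact name against your env file):
```bash
# sdxl-env.sh — enable Stable Diffusion 3 training (sketch).
export STABLE_DIFFUSION_3=true
# Assumed variable name for the base model; check your env file.
export MODEL_NAME="stabilityai/stable-diffusion-3-medium-diffusers"
```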
What's Changed
- speed-up for training sample prep by @bghira in #434
- validation on multiple gpu needs unwrap on unet by @bghira in #435
- validations: unwrap unet when validating, in case we're using DDP (mgpu)
- model card: estimate the number of files correctly for multigpu training
- model card: add an auto-generated code example
- randomised aspect bucket fix/improvement to bucket list rebuild logic
- bugfix for rare case when removing dataset from config mid-training causes infinite loop upon resume by @bghira in #437
- minor bugfixes for skipping unrelated errors during training by @bghira in #438
- stable diffusion 3 support by @bghira in #436
- sd3 lora training support by @bghira in #439
- stable diffusion 3 & validation fixes for SDXL/others by @bghira in #440
Full Changelog: v0.9.6.3...v0.9.7
v0.9.6.3 multigpu training fixes and optimisations
What's Changed
MultiGPU training improvements
Thanks to Fal.ai for providing hardware to investigate and improve these areas:
- VAE caching now reliably runs across all GPUs without missing any entries
- Text embed caching is now parallelised safely across all GPUs
- Other speed improvements, torch compile tests, and more
Pull requests
- mixture-of-experts updates by @bghira in #430
- vaecache: optimise preprocessing performance, getting more GPU utilisation
- diffusers: update version. add workaround for sharded checkpoints, resolved in next version of the library.
- parquet: optimise loading of captions by skipping filter
- trainingsample: fix reference to nonexistent metadata during vae preprocessing by @bghira in #431
- text embed cache: optimise to run across GPUs by @bghira in #432
- vaecache: fix multigpu caching missing a huge number of relevant jobs by @bghira in #433
Full Changelog: v0.9.6.2...v0.9.6.3
v0.9.6.2 mixture-of-experts training
What's Changed
Mixture-of-Experts
Mixture-of-Experts training is now supported, complete with a brief tutorial on how to accelerate your training and start producing mind-blowing results.
- DeepSpeed fix (#424)
- Parquet backend fixes for different dataset sources
- Parquet backend JSON / JSONL support
- Updated check for aspect ratio mismatch to be more reliable by @bghira in #427
- minor bugfixes for sd2.x/controlnet/sdxl refiner training by @bghira in #428
- mixture-of-experts training via segmind models by @bghira in #429
Full Changelog: v0.9.6.1...v0.9.6.2
v0.9.6.1
What's Changed
- remove info log line by @bghira in #418
- blip3: resume captioning an input file and only caption files that have not yet been captioned
- parquet backend: resolve retrieval of width/height from series columns
- documentation: improve phrasing for more inclusivity by @bghira in #419
- toolkit: new captioners, new captioning features for blip3
- parquet backend: better debug logging
- honor DELETE_PROBLEMATIC_IMAGES in the VAE cache backend when a read fails by @bghira in #420
- multigpu fixes
- cuda: update nvidia libs to cuda 12.1 / torch 2.3
- validations: noise scheduler wasn't being configured by @bghira in #422
- randomised bucketing should correct the intermediary size in a special way to ease the pain of implementation by @bghira in #423
- debiased bucket training should rebuild cache upon epoch end (implements #416) by @bghira in #424
- Fix retrieval of parquet captions when not using AWS backend by @bghira in #425
- parquet backend improvements and rebuilding buckets/vae cache on each epoch for randomised bucketing by @bghira in #426
Full Changelog: v0.9.6...v0.9.6.1
v0.9.6 - debias them buckets
debiased aspect bucketing
When training on large datasets of heterogeneous samples, you will discover a content bias among aspect ratios: vertical images contain portraits, widescreen shots are cinematic, and square images tend to be more artistic.
A new feature, `crop_aspect=random`, is introduced in an attempt to combat this issue. A known issue in the implementation (#416) limits its usefulness for small datasets, but in its current state it is capable of de-biasing very large datasets.
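As a rough illustration, a dataloader entry opting into the new behaviour might look like the following (the file name `multidatabackend.json` and all keys besides `crop` and `crop_aspect` are assumptions — consult the dataloader guide for the real schema):
```bash
# Hypothetical dataloader entry enabling randomised aspect cropping.
cat > multidatabackend.json <<'EOF'
[
  {
    "id": "my-dataset",
    "type": "local",
    "instance_data_dir": "/path/to/images",
    "crop": true,
    "crop_aspect": "random",
    "resolution": 1024,
    "resolution_type": "pixel"
  }
]
EOF
```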
What's Changed
- prompt library: rewrite all prompts, focusing on concept diversity and density, to reduce 'sameness' complaints about the prompt library
- logging: reduce logspam in `INFO` log level
- aspect bucketing: ability to randomise aspect buckets without distorting the images (experimental)
- validations: ability to disable uncond generation for a slight speed-up on slow hardware when not necessary
- aspect bucketing: ability to customise the aspect resolution mappings and enforce the resolutions you wish to train on
- captioning toolkit: new scripts for gemini-pro-vision, paligemma 3B and BLIP3
- bugfix: dataloader metadata retrieval would occasionally return the wrong values if filenames match across multiple datasets
A majority of the changes were merged via #417
Full Changelog: v0.9.5.4...v0.9.6
v0.9.5.4 - controlnet training
What's Changed
Experimental ControlNet training support.
- invalidate bad caches when they fail to load by @bghira in #406
- controlnet training support (sdxl+sd2x) by @bghira in #407
- huggingface hub: skip errors when uploading model for SD 2.x and SDXL trainers by @bghira in #410
Full Changelog: v0.9.5.3c...v0.9.5.4