
New feature: Export images #332

Open
wants to merge 18 commits into main

Conversation

StableLlama

This PR adds the ability to export the images to a new folder. During the export, the images can be resized to suit the target model size and cropped to fit the target buckets. After resizing, a slight sharpening is applied, as graphics specialists recommend.

Notable features:

  • Resizing is done with the highest-quality algorithm
  • The best target dimensions are searched thoroughly
  • Color space conversion can be applied, since source images may use different color spaces while the training tools usually know nothing about color spaces
  • A statistical preview is shown
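The bucket-fitting, resizing, and sharpening steps described above could look roughly like this. This is a hedged sketch using Pillow, not the PR's actual code; the bucket list is a hypothetical SDXL-style example, and the unsharp-mask parameters are illustrative.

```python
from PIL import Image, ImageFilter

# Hypothetical bucket list (width, height); the real PR searches target
# dimensions more thoroughly than this simple closest-ratio pick.
BUCKETS = [(1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216)]

def export_image(img):
    src_ratio = img.width / img.height
    # Pick the bucket whose aspect ratio is closest to the source image.
    target_w, target_h = min(BUCKETS, key=lambda b: abs(b[0] / b[1] - src_ratio))
    target_ratio = target_w / target_h
    # Center-crop the source to the bucket's aspect ratio.
    if src_ratio > target_ratio:
        crop_w = round(img.height * target_ratio)
        left = (img.width - crop_w) // 2
        img = img.crop((left, 0, left + crop_w, img.height))
    else:
        crop_h = round(img.width / target_ratio)
        top = (img.height - crop_h) // 2
        img = img.crop((0, top, img.width, top + crop_h))
    # High-quality Lanczos resize, then the slight sharpen the PR describes.
    img = img.resize((target_w, target_h), Image.LANCZOS)
    return img.filter(ImageFilter.UnsharpMask(radius=1, percent=50, threshold=2))
```

A 2048×1536 source (ratio 4:3) would land in the hypothetical 1152×896 bucket after a center crop and downscale.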

Additional code changes:

  • The DEFAULT_SETTINGS mechanism was simplified to reduce the code needed, prevent inconsistency bugs, and improve readability
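The kind of simplification described might look like the following sketch. All names here are hypothetical illustrations, not TagGUI's actual settings code: the idea is a single defaults dict consulted by one accessor, so fallback values are never repeated at call sites.

```python
# Hypothetical keys for illustration; TagGUI's real settings differ.
DEFAULT_SETTINGS = {
    "export_resolution": 1024,
    "export_format": "jpg",
}

class Settings:
    def __init__(self, stored=None):
        self._stored = stored or {}

    def value(self, key):
        # Fall back to DEFAULT_SETTINGS in exactly one place, so a default
        # can never drift out of sync between different call sites.
        return self._stored.get(key, DEFAULT_SETTINGS[key])
```

With this shape, adding a new setting means touching only the defaults dict, which is the inconsistency-bug prevention the PR description alludes to.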

@jhc13
Owner

jhc13 commented Feb 7, 2025

I appreciate your work, but I think this might be outside the scope of the program. TagGUI is designed for creating and editing text captions for images without modifying the images themselves. I think the features you are implementing are better suited for a dedicated program for processing the images.

By the way, TagGUI already supports moving or copying images to a different directory, but these actions do not alter the images.

@StableLlama StableLlama marked this pull request as ready for review February 7, 2025 09:43
@StableLlama
Author

This PR is the base for the next two steps I want to include as well:

  1. support manual cropping
    • Give the user hints to hit exactly the relevant aspect ratios (due to the bucketing, this is not an obvious task)
    • Give the user a hint about what will additionally be cropped to fit a bucket, so the crop can be fine-adjusted
    • Possibly: allow multiple crops per image, e.g. a full-body crop and a face crop from one high-resolution source image.
  2. support masking
    • Have one or more positive (rectangular) masks
    • Have one or more negative (rectangular) masks
    • When exporting, create the real mask from the union/intersection of these masks and either store it in the alpha channel of the exported image or in a parallel directory, just as the usual training scripts expect
    • Likely: show exactly what the mask will and will not cover, as the quantization due to the VAE / latent space might surprise the unaware user
    • Probably: have (rectangular) "hints" that mark major features ("head", "hand"). These can then help to quickly create a mask or crop, e.g. to prevent cropping through a hand, which would produce bad training results
    • Likely: the data storage for these features will be simple enough that external tools can easily create them, allowing a quick semi-automatic workflow. E.g. a watermark detector tool could create a negative mask, and a hand and face detector tool could create hints. The user can then quickly create the crop in such a way that all this information is taken care of. (Of course this functionality could be added to taggui as well, if desired and if someone knowledgeable about YOLO models or similar writes the code)
    • Out of scope: a "real" mask editor where you can paint a mask.
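The mask-combination step in point 2 can be sketched without any imaging library. This is a hedged illustration (names hypothetical, not the planned implementation): the union of the positive rectangles minus the union of the negative rectangles, producing 0/255 values usable as an alpha channel.

```python
def build_mask(width, height, positives, negatives):
    """Return a row-major list of 0/255 values for a width x height image.

    positives/negatives are lists of rectangles (left, top, right, bottom),
    with right and bottom exclusive.
    """
    def covered(x, y, rects):
        return any(l <= x < r and t <= y < b for (l, t, r, b) in rects)

    return [
        255 if covered(x, y, positives) and not covered(x, y, negatives) else 0
        for y in range(height)
        for x in range(width)
    ]
```

The same row-major buffer could be handed to Pillow via `Image.frombytes` or written to a parallel mask directory, whichever the training script expects.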

All of this is the next step in the workflow of starting with a bunch of images and ending up with a ready set of training images. So far I know of no other tool that can do this, I know people besides me who are missing one, it is closely related to tagging the images, and taggui is the perfect base for this extended functionality.
So I think it is in the scope of taggui. Or, to be more precise: it is a valuable extension of the scope into an area where no tools are available.

And all of this is in the (highly appreciated!) spirit of not modifying the source images. That's exactly the reason for creating the export function: it allows you to reuse your valuable data set for new models as they become available:
You are using SD1.5 right now? Tag the images and export them at 512px.
Now you are switching to SDXL or Flux? The only thing to do is a new export at 1024px.
Then the next model comes that does UHD 4K? Still only one more export, this time at 4K. (You can already do this with the code from this PR, even though it doesn't know about 4K.)
Or a future model that does HDR and a wide color gamut, and your image collection already has such images? Then it's again only one export; just select the color space accordingly. (And all exports prior to this were still correct for you, as the color space was converted to sRGB, which is what current models expect. If you had given those models the HDR images by plain copying, the training results could have been disappointing.)
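The sRGB normalization mentioned here is commonly done with Pillow's ImageCms module. A hedged sketch, not the PR's code: if an image carries an embedded ICC profile, convert it to sRGB; otherwise assume it is already sRGB.

```python
import io
from PIL import Image, ImageCms

def to_srgb(img):
    # Current training pipelines generally assume sRGB input, so convert
    # anything with a different embedded ICC profile.
    icc = img.info.get("icc_profile")
    if not icc:
        return img  # no embedded profile; treat as already sRGB
    src_profile = ImageCms.ImageCmsProfile(io.BytesIO(icc))
    srgb_profile = ImageCms.createProfile("sRGB")
    return ImageCms.profileToProfile(img, src_profile, srgb_profile,
                                     outputMode="RGB")
```

Images without a profile pass through untouched, which matches the "plain copy was fine for sRGB sources" point above.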

@StableLlama
Author

A hint about HDR: the Open Model Initiative seems to be actively looking at supporting HDR images: https://github.com/Open-Model-Initiative/HDR_SDXL

Advantages are smaller file sizes when compression is acceptable (quality < 100), or lossless compression with quality = 100. The alpha channel is also supported and kept in the images.

Note 1: this only adds support to the export function; it is not general JPEG XL support for taggui. The fork https://github.com/yggdrasil75/taggui does exactly that.

Note 2: you might need `pip install pillow-jxl-plugin` beforehand to be able to export to JPEG XL.
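A hedged sketch of how the optional dependency can be handled (names hypothetical, not the PR's code): importing pillow-jxl-plugin registers a JXL codec with Pillow, so the export can degrade gracefully when the package is missing.

```python
from PIL import Image

try:
    import pillow_jxl  # noqa: F401  (the import registers the JXL plugin)
    JXL_AVAILABLE = True
except ImportError:
    JXL_AVAILABLE = False

def export_jxl(img, path, quality=100):
    # Per the note above, quality=100 means lossless; lower values compress.
    if not JXL_AVAILABLE:
        raise RuntimeError(
            "JPEG XL export needs: pip install pillow-jxl-plugin")
    img.save(path, quality=quality)
```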
@StableLlama
Author

StableLlama commented Feb 12, 2025

@yggdrasil75 I've seen that you are working on real JPEG XL support. Do you intend to make a pull request out of it soon, as it would work very well with my latest commit to this PR (-> #337 )?

@yggdrasil75
Contributor

I will, once I stop being lazy and actually have it working. Might be tomorrow, now that someone else mentioned it.

@yggdrasil75
Contributor

Made the JXL PR. By the way, the biggest benefit I would see with this would be adding segm/bbox support to auto-mask non-character portions, or to auto-crop to the character. Is that the eventual plan?

@StableLlama
Author

That's exactly where I'm heading.
But before the automation can come (which I actually have no experience with), the manual editing interface must work, as you'll need it for refinement anyway.
And once that is in place, you can easily add automation on top of it.

@StableLlama
Author

StableLlama commented Feb 15, 2025

Now I've also added a few more refinements, so I will not change this branch / PR anymore (unless someone finds bugs that need fixing, of course).
For the next step of the exporting (i.e. the crop editor) I'll create a new PR that requires this PR as a basis.

So please pull it, as well as #335, as both form the basis.
Also #337 would fit very well here.
