Replies: 3 comments
-
This line converts the pixel values of the image from 0-255 to 0-1 and converts the image to a tensor.
And this line converts the pixel values from 0-1 to the range -1 to +1. The same process is defined here: sd-scripts/library/train_util.py Line 115 in 2a23713 and applied here: sd-scripts/library/train_util.py Line 1135 in 2a23713
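As a minimal sketch of those two steps in plain numpy (illustrative only, not the actual code in train_util.py):

```python
import numpy as np

# Step 1: scale uint8 pixel values from [0, 255] to [0.0, 1.0]
# (this is what transforms.ToTensor() does, along with the HWC -> CHW conversion)
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = pixels.astype(np.float32) / 255.0      # -> [0.0, 0.502, 1.0]

# Step 2: map [0.0, 1.0] to [-1.0, +1.0]
# (this is what transforms.Normalize([0.5], [0.5]) does: (x - 0.5) / 0.5)
normalized = (scaled - 0.5) / 0.5               # -> [-1.0, 0.004, 1.0]
print(normalized)
```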
-
Ah yeah, thanks. I tried changing the gamma of images to make them more balanced, but all it does to images with very bright backgrounds is turn the foreground into a silhouette. I think modern image training networks just can't handle very bright images for now, so the best thing to do is remove them from the training set. Or maybe I'll try that new 'mask' code that you're just about to add to the scripts, to mask away the bright backgrounds.
-
I've now been doing more training runs with the following two rules in place:
I'm seeing great results from this. My runs last much longer without overtraining, and the concepts are being picked up in more detail than before.
-
I noticed that a training run of mine was producing brighter and brighter sample images over time, and faces seemed to be losing detail too. Usually I would put that down to overtraining, but I felt it was too early in my training cycle for that.
So I looked at my input images. Many of them were photoshoot-style images, with solid white backgrounds. I deleted those ones, even though they made up half of my training set, leaving me with just 31 images, which is not particularly many. But when I restarted the training run with the same parameters as before, I actually got better results, and the high-glare look was mostly gone.
I suspect a lot of people conclude their model is overtrained when really it's just the average brightness that has drifted because of their input image set.
Looking into the topic of image brightness, I found that Huggingface's own Dreambooth training script automatically normalizes the pixel values of the input images as part of its preprocessing (from https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py).
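For context, the preprocessing there is built from torchvision transforms along these lines (a rough sketch; the exact resolution, crop choice, and arguments may differ between versions of that file):

```python
from torchvision import transforms

resolution = 512  # assumed training resolution for this sketch

# Roughly the per-image preprocessing used by the Dreambooth example script:
image_transforms = transforms.Compose([
    transforms.Resize(resolution, interpolation=transforms.InterpolationMode.BILINEAR),
    transforms.CenterCrop(resolution),
    transforms.ToTensor(),               # [0, 255] uint8 -> [0.0, 1.0] float
    transforms.Normalize([0.5], [0.5]),  # [0.0, 1.0]     -> [-1.0, +1.0]
])
```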
I don't think the Kohya scripts support this? I think it might help people train higher quality models if there was a simple command line option to perform this image operation while generating the latents. What do you think?
Edit: I'm testing out normalizing my input dataset with Python now. The images with bright white backgrounds end up looking very washed out, and there's a lot of color clamping against 0 and 1 happening. This may not be as simple and good an idea as it sounded above.
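Roughly the kind of linear normalization I mean (a simplified sketch; the function name and target value are just for illustration, not my actual script):

```python
import numpy as np
from PIL import Image

def linear_brightness_shift(path, target_mean=0.5):
    """Shift an image so its mean brightness lands on target_mean, then clip.

    Any pixels pushed outside [0, 1] get clamped: dark detail is crushed to
    black when the shift is negative, and highlights flatten to white when
    it is positive.
    """
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    shift = target_mean - img.mean()
    adjusted = np.clip(img + shift, 0.0, 1.0)   # clamping against 0 and 1 happens here
    return Image.fromarray((adjusted * 255).astype(np.uint8))
```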
I'm starting to think that normalizing the brightness the way Gimp does when you move the brightness centerpoint in its Levels window might make the images more suitable for the network without washing them out. It's more of a curve bend of the brightness than a linear scaling.
Edit: My training run has been going for a while now, and it has picked up the washed-out look. The clamping of the normalized pixels to 0 and 1 (0 and 255 in byte values) is also damaging the output: lots of black eyeshadow is appearing where it shouldn't, because the dark eyes in the input images had many pixels clamped to black. I'll probably try a Gimp-style brightness-centerpoint bend tomorrow and see if that works out better.
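For reference, a sketch of the centerpoint bend I have in mind, implemented as a power curve (the function and parameter names are just for illustration):

```python
import numpy as np
from PIL import Image

def levels_midpoint_adjust(path, current_midpoint=0.7):
    """Bend the brightness curve so current_midpoint maps to 0.5.

    A power curve keeps 0.0 at 0.0 and 1.0 at 1.0, so nothing gets clamped;
    midtones are simply pushed down (or up), similar in spirit to dragging
    the middle slider in Gimp's Levels window.
    """
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    exponent = np.log(0.5) / np.log(current_midpoint)  # chosen so current_midpoint -> 0.5
    bent = img ** exponent
    return Image.fromarray((bent * 255).astype(np.uint8))

# Example: pull an overly bright image's midpoint (around 0.7) back down to mid-grey.
# levels_midpoint_adjust("bright_photoshoot.png", current_midpoint=0.7).save("adjusted.png")
```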