add detail dataset config feature by extra config file #227
* bucket_repo_range
* shuffle_keep_tokens
We need to provide However, this error message seems weird, right? This happens because voluptuous shows only the first error when there are multiple errors. In this case, it first tries to parse the config as a DreamBooth dataset, which raises an extra-key error because the DreamBooth dataset does not support the "metadata_file" option. It then tries to parse the config as a fine tuning dataset, which raises a required-key error because there is no "image_dir" option. I think adding another required config like
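The fallback behavior described above can be sketched in plain Python. This is not voluptuous itself: the schema table, the `validate`, `first_error_only`, and `all_errors` helpers, and the error wording are all hypothetical stand-ins, and only the option names ("image_dir", "metadata_file", "is_reg", "caption_extension") come from this discussion.

```python
# Hypothetical stand-ins for the two subset schemas: (name, allowed keys, required keys).
# The real validators in the PR are built with voluptuous and will differ.
SCHEMAS = [
    ("DreamBooth", {"image_dir", "is_reg", "caption_extension"}, {"image_dir"}),
    ("fine tuning", {"image_dir", "metadata_file"}, {"image_dir"}),
]


def validate(config, allowed, required):
    """Return every problem found when checking config against one schema."""
    present = set(config)
    errors = [f"extra keys not allowed: {key!r}" for key in sorted(present - allowed)]
    errors += [f"required key not provided: {key!r}" for key in sorted(required - present)]
    return errors


def all_errors(config):
    """Collect the errors from every schema, keyed by schema name."""
    return {name: validate(config, allowed, required) for name, allowed, required in SCHEMAS}


def first_error_only(config):
    """Try each schema in order; if none matches, report a single error.

    Reporting only one error hides the fact that the second schema
    failed for a different (and often more relevant) reason.
    """
    report = all_errors(config)
    if any(not errors for errors in report.values()):
        return None  # at least one schema accepted the config
    name, errors = next(iter(report.items()))
    return f"{name}: {errors[0]}"


config = {"metadata_file": "meta.json"}  # fine tuning subset that forgot image_dir
print(first_error_only(config))
print(all_errors(config))
```

With this config the single reported error complains about "metadata_file", even though the user's actual mistake is the missing "image_dir". Printing the errors per schema makes that visible.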
Thank you for the details; I now understand the behavior of voluptuous.
I agree. I forgot to add
I think it might be a good idea. I also wonder that
I'm afraid that simply swapping the parsing order would still cause the same issue, because the DreamBooth dataset also has distinct options, "is_reg" and "caption_extension". However, checking whether
Ah, I have found I think
Thank you for updating! I think we are now about ready to release it :) I would like to confirm one thing: is my understanding correct that we cannot mix DreamBooth subsets and fine tuning subsets within a single dataset?
Update:
You are right. These two subset types cannot be mixed in a single dataset. This is mainly because I have no idea how to compensate for the number of regularization images when there are also fine tuning subsets. If you come up with a way to do it, these different subset types might be able to be mixed together. I have added a description of this topic to the README.
Thank you for updating! The code and README are quite good!
Thank you for the clarification. I got it. I think the number of regularization images for the dataset could be the sum over all non-regularization subsets. For example, if someone wants to train a particular character using captioned images along with regularization images, it is preferable for that person to be able to use metadata-based subsets as well as captioned DreamBooth subsets. I will merge the PR after work today 😀
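The suggestion above can be illustrated with a small calculation. This is only a sketch of the idea, not the merged implementation: the `Subset` class and its fields are hypothetical, and counting each image once per repeat is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Subset:
    # Hypothetical minimal description of a subset, for illustration only.
    num_images: int
    num_repeats: int = 1
    is_reg: bool = False


def regularization_target(subsets):
    """Sum the (repeated) image counts of all non-regularization subsets.

    Assumption: regularization images would be repeated or truncated
    to match this total, regardless of the subset type.
    """
    return sum(s.num_images * s.num_repeats for s in subsets if not s.is_reg)


subsets = [
    Subset(num_images=20, num_repeats=10),  # captioned DreamBooth-style subset
    Subset(num_images=100),                 # metadata-based fine tuning subset
    Subset(num_images=50, is_reg=True),     # regularization subset, excluded from the sum
]
print(regularization_target(subsets))  # 20 * 10 + 100 * 1 = 300
```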
I've finally released the feature! I've changed the name of the option to Thank you again for this great contribution!
README: https://github.com/fur0ut0/sd-yascripts/blob/feature/dataset_config/config_README-ja.md
Solve #58 #130
There are several major changes:
* --config_file option
* DatasetGroup and use it as pseudo dataset
* --bucket_shuffle_across_dataset
I have preserved backward compatibility with the current DreamBooth directory handling, which uses the subdirectory structure to specify class tokens and the number of dataset repeats.
NOTE: Sorry for the somewhat messy history. A squash merge may be the better option.
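As a rough illustration of the directory handling mentioned above, the sketch below parses a subdirectory name into a repeat count and class tokens. The `<repeats>_<class tokens>` naming pattern and the `parse_dreambooth_dirname` helper are assumptions made for this example; the actual parsing code in the PR may differ.

```python
def parse_dreambooth_dirname(name):
    """Split a directory name such as "10_sks dog" into (repeats, class_tokens).

    Assumes the "<repeats>_<class tokens>" convention; returns None when
    the name does not follow it, so callers can skip or report the directory.
    """
    head, sep, tail = name.partition("_")
    if not sep or not head.isdecimal():
        return None
    return int(head), tail


print(parse_dreambooth_dirname("10_sks dog"))
print(parse_dreambooth_dirname("images"))
```

Returning None instead of raising keeps the decision about unrecognized directories with the caller.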