Add New lr scheduler #1393

Merged: 8 commits merged into kohya-ss:dev on Sep 11, 2024

sdbds (Contributor) commented Jun 28, 2024

Change the LR scheduler library from diffusers to transformers. The two are built on the same scheduler code, but transformers has more LR schedulers.
The community has long asked for customizable learning rate schedules, and I think these newly added schedulers will meet that demand.

Add inverse sqrt learning rate scheduler
[image]
New argument lr_scheduler_timescale, defaults to the number of warmup steps.
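As a rough illustration of what an inverse square root schedule computes, here is a minimal pure-Python sketch. It follows my understanding of the transformers inverse-sqrt schedule; the function name `inverse_sqrt_factor` is hypothetical, and the fallback timescale of 10,000 when there is no warmup is an assumption.

```python
import math


def inverse_sqrt_factor(step, num_warmup_steps, timescale=None):
    """Return the multiplier applied to the base learning rate at `step`.

    Hypothetical helper: linear warmup, then decay proportional to
    1 / sqrt(step), shifted so the curve is continuous at the end of warmup.
    """
    if timescale is None:
        # Assumption: timescale defaults to the warmup length (10,000 if no warmup).
        timescale = num_warmup_steps or 10_000
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    shift = timescale - num_warmup_steps
    return 1.0 / math.sqrt((step + shift) / timescale)


# With 10 warmup steps and the default timescale, the factor peaks at 1.0
# when warmup ends and halves once the step count has quadrupled.
factors = [inverse_sqrt_factor(s, num_warmup_steps=10) for s in (5, 10, 40)]
```

A larger timescale flattens the decay, which is presumably why it is exposed as a separate argument.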

Add cosine with min lr scheduler
With num_steps=100, num_warmup_steps=10, lr=0.2, and min_lr=0.01, the learning rate looks like:
[image]

New argument lr_scheduler_min_lr_ratio, defaults to 0, which behaves like the plain cosine LR scheduler.
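The numbers above can be reproduced with a small pure-Python sketch. This is an approximation of a cosine schedule with a learning rate floor, not the library code itself; the function name `cosine_with_min_lr_factor` is hypothetical, and expressing min_lr=0.01 as a ratio of 0.05 relative to lr=0.2 is my reading of how lr_scheduler_min_lr_ratio is meant to be used.

```python
import math


def cosine_with_min_lr_factor(step, num_warmup_steps, num_training_steps,
                              min_lr_ratio=0.0, num_cycles=0.5):
    """Multiplier for the base LR: linear warmup, then cosine decay to a floor."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    # Rescale the cosine from [0, 1] to [min_lr_ratio, 1].
    return max(0.0, cosine * (1.0 - min_lr_ratio) + min_lr_ratio)


base_lr = 0.2
min_lr = 0.01
min_lr_ratio = min_lr / base_lr  # the floor as a fraction of the peak LR

# Learning rate at every step from 0 to 100 inclusive.
schedule = [
    base_lr * cosine_with_min_lr_factor(step, 10, 100, min_lr_ratio)
    for step in range(101)
]
```

With min_lr_ratio left at 0 the rescaling step is a no-op, so the curve reduces to ordinary cosine decay with warmup.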

Add WSD (warmup-stable-decay) scheduler
The ladder scheduler that so many people have been asking for.
[image]

New argument lr_decay_steps.
New argument lr_scheduler_min_lr_ratio, defaults to 0.
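For readers unfamiliar with WSD, the three phases can be sketched in pure Python as follows. The function name `wsd_factor` is hypothetical, the cosine-shaped decay phase is an assumption based on my understanding of the transformers implementation, and how the stable phase length is derived from lr_decay_steps in this PR is not shown here.

```python
import math


def wsd_factor(step, num_warmup_steps, num_stable_steps, num_decay_steps,
               min_lr_ratio=0.0, num_cycles=0.5):
    """Multiplier for the base LR under a warmup-stable-decay schedule."""
    if step < num_warmup_steps:
        # Phase 1: linear warmup from 0 to 1.
        return step / max(1, num_warmup_steps)
    if step < num_warmup_steps + num_stable_steps:
        # Phase 2: hold the peak learning rate.
        return 1.0
    if step < num_warmup_steps + num_stable_steps + num_decay_steps:
        # Phase 3: cosine-shaped decay from 1 down to min_lr_ratio.
        progress = (step - num_warmup_steps - num_stable_steps) / max(1, num_decay_steps)
        cosine = max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))
        return (1.0 - min_lr_ratio) * cosine + min_lr_ratio
    # After the decay phase, stay at the floor.
    return min_lr_ratio


# Example: 10 warmup steps, 70 stable steps, 20 decay steps, floor at 10% of peak.
curve = [wsd_factor(s, 10, 70, 20, min_lr_ratio=0.1) for s in range(101)]
```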

This requires updating the requirement to transformers==4.41.2.

sdbds (Contributor, Author) commented Aug 30, 2024

The bugs are fixed now; I think it works well.
@kohya-ss

kohya-ss (Owner) commented Sep 1, 2024

Thank you! However, removing PIECEWISE_CONSTANT will likely impact users. I would like to update the code after merging so that it can be used, but it will take some time. I'd like to prioritize some other PRs. If you could fix this, that would be great.

sdbds (Contributor, Author) commented Sep 1, 2024

> Thank you! However, removing PIECEWISE_CONSTANT will likely impact users. I would like to update the code after merging so that it can be used, but it will take some time. I'd like to prioritize some other PRs. If you could fix this, that would be great.

OK, I have kept the diffusers import as a fallback, and use

name = SchedulerType(name) or DiffusersSchedulerType(name)
schedule_func = TYPE_TO_SCHEDULER_FUNCTION[name] or DIFFUSERS_TYPE_TO_SCHEDULER_FUNCTION[name]

to determine whether PIECEWISE_CONSTANT is being used.
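One caveat worth illustrating: in Python, constructing an Enum from an unknown value raises ValueError, and a missing dict key raises KeyError, so an `or` chain on its own does not fall through to the second lookup. The sketch below shows one way such a fallback could be written explicitly. The enums and tables here are trimmed, hypothetical stand-ins for the real transformers and diffusers objects, and `resolve_scheduler` is not a function from this PR.

```python
from enum import Enum


class SchedulerType(Enum):
    """Stand-in for the transformers scheduler enum (trimmed, hypothetical)."""
    COSINE = "cosine"
    INVERSE_SQRT = "inverse_sqrt"


class DiffusersSchedulerType(Enum):
    """Stand-in for the diffusers scheduler enum (trimmed, hypothetical)."""
    COSINE = "cosine"
    PIECEWISE_CONSTANT = "piecewise_constant"


# Placeholder factories; the real tables map to scheduler constructor functions.
TYPE_TO_SCHEDULER_FUNCTION = {
    SchedulerType.COSINE: lambda: "transformers cosine",
    SchedulerType.INVERSE_SQRT: lambda: "transformers inverse_sqrt",
}
DIFFUSERS_TYPE_TO_SCHEDULER_FUNCTION = {
    DiffusersSchedulerType.COSINE: lambda: "diffusers cosine",
    DiffusersSchedulerType.PIECEWISE_CONSTANT: lambda: "diffusers piecewise_constant",
}


def resolve_scheduler(name):
    """Prefer the transformers table, fall back to diffusers for unknown names."""
    try:
        key = SchedulerType(name)
        return key, TYPE_TO_SCHEDULER_FUNCTION[key]
    except (ValueError, KeyError):
        # Raises ValueError again if neither library knows the name.
        key = DiffusersSchedulerType(name)
        return key, DIFFUSERS_TYPE_TO_SCHEDULER_FUNCTION[key]


key, schedule_func = resolve_scheduler("piecewise_constant")
```

Names known to both libraries resolve to the transformers entry, and only names unique to diffusers, such as piecewise_constant, take the fallback path.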
I also added a parser type that accepts a float for the warmup and decay ratios, so users don't need to calculate the total number of training steps themselves.
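A minimal sketch of how an int-or-float argument type might work with argparse is shown below. The helper names `int_or_float` and `resolve_steps`, the rule that a float in [0, 1] is treated as a ratio of total steps, and the use of `--lr_warmup_steps` as the option carrying the ratio are all assumptions for illustration, not the code from this PR.

```python
import argparse


def int_or_float(text):
    """argparse type: whole numbers become int step counts, others become float ratios."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected an int or float, got {text!r}")


def resolve_steps(value, total_steps):
    """Convert a ratio of total training steps into an absolute step count."""
    if isinstance(value, float):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"ratio must be within [0, 1], got {value}")
        return round(value * total_steps)
    return value


parser = argparse.ArgumentParser()
parser.add_argument("--lr_warmup_steps", type=int_or_float, default=0)
parser.add_argument("--lr_decay_steps", type=int_or_float, default=0)

# A ratio for warmup and an absolute count for decay, parsed from a sample command line.
args = parser.parse_args(["--lr_warmup_steps", "0.1", "--lr_decay_steps", "200"])
warmup_steps = resolve_steps(args.lr_warmup_steps, total_steps=1000)
decay_steps = resolve_steps(args.lr_decay_steps, total_steps=1000)
```

Resolving the ratio late, once the total step count is known, is what lets the user skip computing it by hand.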

kohya-ss (Owner) commented Sep 9, 2024

Sorry for bothering you again. Is there a reason to upgrade the version of the library other than transformers? It requires some more comprehensive testing, which takes time.

sdbds (Contributor, Author) commented Sep 9, 2024

> Sorry for bothering you again. Is there a reason to upgrade the version of the library other than transformers? It requires some more comprehensive testing, which takes time.

The main goal is to upgrade transformers. Once transformers is upgraded, the other two dependencies also require updates, so there are three version bumps in total.
Since the PR is fairly old, I reviewed the recent update histories of all three dependencies, and there don't appear to be any major bug fixes or breaking changes.

kohya-ss (Owner) commented Sep 9, 2024

Thanks, I understand. Then it seems like there is no big problem. I've updated accelerate and transformers in the sd3 branch, so maybe I can match that. I'll do some checks and merge :)

kohya-ss merged commit fd68703 into kohya-ss:dev on Sep 11, 2024
1 check passed
kohya-ss (Owner) commented

Sorry for the delay, I have merged this.
