Minimum learning rate should be allowed to be set in lr schedulers #26209
Comments
Is the following not pretty much what you are looking for?

> min_lr_ratio (`float`, *optional*, defaults to 0): The final learning rate at the end of the linear decay will be `init_lr * min_lr_ratio`.
Yes, it can work for me. This argument is in

I ran into the same kind of problem today. I think adding that option is a good idea.

Would one of you like to open a PR for this? 🤗

Yeah, let me do that!
Wait, I face the same problem, but simply changing 0.0 to min_lr_ratio will not work. This is my custom trainer; I don't know how to replicate the behavior from the papers yet, so please correct me if I am wrong or if I am misunderstanding your implementation.
Yes, you are right.

If we want the lr to be reduced slowly over all the training steps, we can set milestone = num_training_steps - 1. This is the default so that the behavior aligns with the original code. As for the cosine scheduler, I found that I can change the parameter. However, this parameter cannot be reached from the Trainer settings. For me, the current workaround is to add this parameter like:

I'm thinking about adding it into TrainingArguments directly. Does this implementation make sense to you? Any suggestions are welcome.
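For the cosine case, the same idea can be sketched in pure Python: decay the multiplier toward a floor instead of toward zero. Note that `min_lr_ratio` here is a hypothetical parameter name suggested in this thread, not an existing Trainer argument.

```python
import math


def cosine_with_min_lr_lambda(
    current_step: int,
    *,
    num_warmup_steps: int,
    num_training_steps: int,
    min_lr_ratio: float = 0.0,
) -> float:
    """Cosine-decay lr multiplier that ends at min_lr_ratio instead of 0."""
    if current_step < num_warmup_steps:
        # Linear warmup from 0 up to the initial lr.
        return float(current_step) / float(max(1, num_warmup_steps))
    progress = float(current_step - num_warmup_steps) / float(
        max(1, num_training_steps - num_warmup_steps)
    )
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
    # Rescale so the multiplier decays from 1.0 down to min_lr_ratio.
    return min_lr_ratio + (1.0 - min_lr_ratio) * cosine
```

With `min_lr_ratio=0.1`, the multiplier starts at 1.0 after warmup and settles at 0.1 by the final step, so the lr never drops below 10% of its initial value.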
Just replace

Any update?
Hello, your email has been received. Thank you!
Feature request
In the current lr schedulers provided in optimization.py, the minimum learning rate is always 0.0. We could add one more input parameter like "min_lr" to let the user define the minimum learning rate.
Take `_get_linear_schedule_with_warmup_lr_lambda` as an example. Original:
We can change it into:
Motivation
In some papers, the authors describe their lr scheduling. Take LIMA as an example:

To reproduce the experiment using their recipe, I need to rewrite the scheduler (and all the related functions like get_scheduler/create_scheduler) in the trainer, which makes the code really ugly.

So I think it would be good to have this kind of feature to make the trainer more flexible.
Your contribution
I can submit a PR for this feature.