Learning rate scheduler #738

Open
NicoZenith opened this issue Oct 18, 2024 · 3 comments

@NicoZenith

🚀 The feature, motivation and pitch

I don't see any option to set up a learning rate scheduler in the fine-tuning input arguments. Is there a way to implement one?

Alternatives

No response

Additional context

No response

mreso (Contributor) commented Oct 28, 2024

Hi @NicoZenith, currently there is only a learning rate decay scheduler implemented, which you can configure through train_config.gamma. What options are you looking for?
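
For reference, the existing per-epoch behavior presumably boils down to something like this (an untested sketch using PyTorch's StepLR; the stand-in model, optimizer, and gamma value are placeholders, and the actual wiring in llama_recipes may differ):

    import torch
    from torch.optim.lr_scheduler import StepLR

    # Stand-in model/optimizer for illustration only
    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # gamma here plays the role of train_config.gamma
    scheduler = StepLR(optimizer, step_size=1, gamma=0.85)

    for epoch in range(3):
        # ... all training iterations for this epoch run here ...
        optimizer.step()
        scheduler.step()  # lr is multiplied by gamma once per epoch
        print(epoch, scheduler.get_last_lr())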

NicoZenith (Author) commented

@mreso this gamma factor decays the learning rate after each epoch; I'm looking for a scheduler that decays it over iterations.
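
I.e. something like this, with the scheduler stepped inside the iteration loop rather than once per epoch (a plain-PyTorch sketch, not part of llama-recipes; the decay shape is just an example):

    import torch
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # Example: inverse square-root decay as a function of the iteration count
    scheduler = LambdaLR(optimizer, lr_lambda=lambda step: (1.0 + step) ** -0.5)

    for epoch in range(2):
        for it in range(100):
            # forward/backward pass would go here
            optimizer.step()
            scheduler.step()  # decay once per iteration, not per epoch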

mreso (Contributor) commented Oct 28, 2024

@NicoZenith I see what you mean. I think it would be a great idea to provide more flexibility in the learning rate schedule, and also to allow for warmup steps, which we currently don't support IIRC.
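
Warmup itself is easy to express with stock PyTorch, e.g. via LambdaLR (sketch only; warmup_steps and the decay shape after warmup are placeholders):

    import torch
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    warmup_steps = 500  # placeholder value

    def warmup_then_decay(step):
        # Linear warmup to the base lr, then inverse-sqrt decay
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return (warmup_steps / step) ** 0.5

    scheduler = LambdaLR(optimizer, lr_lambda=warmup_then_decay)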

My first thought was to implement it similarly to custom_dataset, with the option to point to a file containing a function that creates an LRScheduler such as StepLR, plus a config option to choose whether we call step after an epoch or after an iteration:

from dataclasses import dataclass

@dataclass
class LRScheduler:
    # Dotted path to a factory function that builds the scheduler
    scheduler: str = "llama_recipes.utils.lr_schedulers.get_step_lr"
    # This is not good for customization.....
    step_on_epoch_end: bool = True
    step_on_iteration_end: bool = False

    def __post_init__(self):
        assert self.step_on_epoch_end != self.step_on_iteration_end, \
            "Choose to either step after the epoch or after the iteration ends, not both"

But then we don't have a great way to route parameters to the scheduler (like the gamma for the StepLR we have now) or to add custom parameters to a custom factory. I will try to give it some thought over the next few days and come up with a design pattern that we can also reuse in other areas.
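
One possible shape for that routing (just a sketch; LRSchedulerConfig, build_scheduler, and the "module:function" path convention are all made up for illustration) would be to carry an opaque kwargs dict in the config and forward it to the factory:

    import importlib
    from dataclasses import dataclass, field

    @dataclass
    class LRSchedulerConfig:
        # "module:function" path resolved at runtime (hypothetical convention)
        factory: str = "torch.optim.lr_scheduler:StepLR"
        step_per: str = "epoch"  # "epoch" or "iteration"
        # Scheduler-specific parameters, forwarded verbatim to the factory
        kwargs: dict = field(default_factory=lambda: {"step_size": 1, "gamma": 0.85})

    def build_scheduler(config: LRSchedulerConfig, optimizer):
        module_name, func_name = config.factory.split(":")
        factory = getattr(importlib.import_module(module_name), func_name)
        return factory(optimizer, **config.kwargs)

A custom factory would then accept whatever extra keys the user puts into kwargs, which sidesteps the routing problem at the cost of losing static validation of those parameters.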
