
Move to Optimisers.jl and add schedules #1487

Closed · wants to merge 1 commit

Conversation

@darsnack (Member) commented Jan 29, 2021

This PR is a WIP to update Flux to reflect FluxML/Optimisers.jl#9. It is similar to #1481, but it excludes some orthogonal changes to train!. It also contains the scheduling policies that are missing from #1481, and it uses a more complete version of Optimisers.jl.

Improvements/changes:

  • Optimisers are now in Optimisers.jl
  • Scheduling policies from ParameterSchedulers.jl
  • Optimiser is now called Sequence

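To illustrate the Optimiser → Sequence renaming, here is a minimal self-contained sketch of the assumed semantics (this is not the Optimisers.jl source; the names `clipnorm` and `scale` are hypothetical stand-ins for rules like ClipNorm and Descent): each stage transforms the gradient, applied in order.

```julia
# Sketch of the assumed `Sequence` semantics: a pipeline of
# gradient transformations, applied left to right.
struct Sequence
    stages::Vector{Function}
end
(s::Sequence)(Δ) = foldl((g, f) -> f(g), s.stages; init = Δ)

# Hypothetical stand-ins for two common rules:
scale(η) = Δ -> η .* Δ                                   # like Descent's step
clipnorm(c) = Δ -> Δ .* min(1, c / sqrt(sum(abs2, Δ)))   # like ClipNorm

opt = Sequence([clipnorm(1.0), scale(0.1)])
```

Applying `opt` to a gradient first rescales it to norm at most 1, then multiplies by the step size, mirroring how the old Optimiser chained its constituent rules.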
This is still not in a state ready to merge. I need to sort out testing, documentation, and gradient clipping. Does anyone know if you can cross-reference docstrings from other packages inside a package? Ideally, we would write the docstrings in Optimisers.jl and ParameterSchedulers.jl and reference them from Flux's docs.
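One option (an assumption on my part, not something this PR settles) is that Documenter.jl can render docstrings defined in dependency packages, as long as their modules are listed in the docs build:

```julia
# docs/make.jl sketch: listing the dependency modules lets `@docs`
# blocks in the markdown pages pull in docstrings defined there,
# e.g. a block referencing Optimisers.Descent.
using Documenter, Flux, Optimisers, ParameterSchedulers

makedocs(
    sitename = "Flux",
    modules = [Flux, Optimisers, ParameterSchedulers],
)
```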

Note that schedulers compose easily with optimizers:

opt = Momentum()
# cosine schedule between 0 and opt.eta over a 20-step period
sched = Schedule.Cos(λ0 = 0, λ1 = opt.eta, period = 20)
sched_opt = Scheduler(sched, opt)
# use sched_opt anywhere you use opt
Flux.train!(loss, params(m), data, sched_opt)
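For intuition, the Cos schedule above can be sketched as a plain function. This is an illustration of one common cosine-annealing definition, not necessarily the exact formula ParameterSchedulers.jl uses:

```julia
# Illustrative cosine schedule (assumed formula): with λ0 = 0 and
# λ1 = opt.eta, the rate starts at eta at t = 1 and decays to 0
# after `period` steps.
cos_schedule(t; λ0, λ1, period) =
    λ0 + (λ1 - λ0) * (1 + cos(π * (t - 1) / period)) / 2
```

Under this definition, `cos_schedule(1; ...)` returns λ1 and `cos_schedule(period + 1; ...)` returns λ0, which is the annealing behavior the Scheduler feeds into the wrapped optimiser each step.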

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable

cc @DhairyaLGandhi @CarloLucibello @mcabbott @ToucheSir

@DhairyaLGandhi (Member) commented Jan 29, 2021

Hmm, I'm not keen to add a dependency on ParameterSchedulers to Flux. Further, I don't think #1481 is finished yet, so I would rather have changes be recommended there if possible.

It's easier to track progress in one place than across competing PRs, which obscures the underlying changes.

@darsnack (Member, Author)

I don't mind closing one PR in favor of another. I just opened this to be able to show the code, because I don't have write access to your fork. That being said, what's the issue with ParameterSchedulers? It has no dependencies and adheres to existing interfaces in Base.

@DhairyaLGandhi (Member)

Yeah, best to move discussion related to Optimisers.jl there.

There's nothing wrong with it, but it deviates slightly from the design choices and assumptions we had previously made in Flux, and I would rather vet that so we have a clear picture.
