Roadmap #1829
Comments
with regards to
I think this might synergize really well with some form of ML leaderboard/benchmark. There are multiple datasets which can be trained with multiple models, and those can themselves be trained with multiple optimisers. If the interfaces are well defined, you could just import datasets, models and optimisers and mix and match without having to change any code.
This would allow us to create something like this: https://paperswithcode.com/sota/image-classification-on-mnist but instead of git repos, it would link Julia packages with the models/optimisers. And you could simply install these models without having to adapt their code.
At the same time, all those models would act as a smoke test for changes to Flux.jl. Simply install a bunch of model packages and see whether or not training still works. And since those models are created by other people and only dynamically included for testing, they would not go stale like a manually maintained model zoo.
That being said, this is probably more of a long-term thing and, as you said, no fancy webpage is necessary. But we could keep that in mind as a growth target when coming up with solutions, so that they can grow in that direction.
I looked at different optimisers in my master's thesis and just started a PhD without a clear goal yet. I think I want to continue to look at optimisers. For that I will need to benchmark them, so I will probably reimplement some models from other people. Might as well do that for you if you want. I also have some experience with CI and testing, although not as much in Julia specifically. Would love to contribute something.
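To make the mix-and-match idea in the comment above concrete, here is a minimal, language-neutral sketch in Python. It is not Flux code and does not reflect any existing package: the registries, the `ScalarLinearModel` class, the `loss_and_grad` method and the optimiser factories are all hypothetical names chosen for illustration. The only point is that, once the three interfaces are fixed, a smoke test can loop over every combination without any per-model code.

```python
from itertools import product


def linear_dataset():
    """Hypothetical dataset: a few (x, y) pairs drawn from y = 3x."""
    return [(x, 3.0 * x) for x in (-1.0, 0.0, 1.0, 2.0)]


class ScalarLinearModel:
    """Hypothetical model y = w * x with a mean-squared-error loss."""

    def __init__(self):
        self.w = 0.0

    def loss_and_grad(self, data):
        n = len(data)
        loss = sum((self.w * x - y) ** 2 for x, y in data) / n
        grad = sum(2.0 * (self.w * x - y) * x for x, y in data) / n
        return loss, grad


def sgd(lr=0.1):
    """Plain gradient descent; returns a step function (w, grad) -> new w."""
    def step(w, grad):
        return w - lr * grad
    return step


def momentum(lr=0.05, beta=0.9):
    """Heavy-ball momentum; the velocity is kept in the closure."""
    velocity = 0.0

    def step(w, grad):
        nonlocal velocity
        velocity = beta * velocity + grad
        return w - lr * velocity
    return step


# Assumed registries: in the proposal these entries would come from
# independently maintained packages rather than from one file.
DATASETS = {"linear": linear_dataset}
MODELS = {"scalar_linear": ScalarLinearModel}
OPTIMISERS = {"sgd": sgd, "momentum": momentum}


def smoke_test(steps=100):
    """Train every dataset/model/optimiser combination; report whether loss fell."""
    results = []
    combos = product(DATASETS.items(), MODELS.items(), OPTIMISERS.items())
    for (d_name, make_data), (m_name, make_model), (o_name, make_opt) in combos:
        data, model, step = make_data(), make_model(), make_opt()
        initial, _ = model.loss_and_grad(data)
        for _ in range(steps):
            _, grad = model.loss_and_grad(data)
            model.w = step(model.w, grad)
        final, _ = model.loss_and_grad(data)
        results.append({
            "dataset": d_name, "model": m_name, "optimiser": o_name,
            "initial": initial, "final": final, "ok": final < initial,
        })
    return results


if __name__ == "__main__":
    for row in smoke_test():
        print(row)
```

Adding a new optimiser or model in this sketch means adding one registry entry; the loop in `smoke_test` is untouched, which is the property the comment is asking for.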
Yeah, any help achieving these goals is greatly appreciated. Take a look at FluxBench.jl. It's probably a good starting point for this. Feel free to ping us on Slack or Zulip as needed as well.
#1866 has nothing to do with bors; there were failing tests and #1866 was trying to figure out why. It ultimately turned out to be inaccuracies in cuDNN. If anything, it seems bors inadvertently stopped us from merging hacks which would have been brittle.
That's part of reverse CI and benchmarking. The issue with FluxBench is not technical; see FluxML/FluxBench.jl#15, where I am trying to run with newer Zygote and CUDA. It is straightforward to add a script which runs these on a cron-job-like basis, but that clogs up the benchmark queue. I actually recently spoke with @maleadt to see if it's alright to run it at a constant cadence, and it seems like we can do it within reason. You can see the errors that upgrading Zygote is producing.
What are the particular issues with Transformers?
Great list, all those items are very relevant. I would add a couple of points:
As for the docs, it would be good to include some frequent questions and workarounds from Discourse, e.g. using gradients in the loss function. For the pre-trained models, maybe we can leverage some of the great work @dfdx is doing on ONNX.jl and get the weights and architectures from https://github.com/onnx/models. Being able to load weights from HuggingFace for some relevant models would also be great.
RE Hugging Face, I'd be remiss not to point out @logankilpatrick's tutorial and @chengchingwen's work on https://github.com/FluxML/HuggingFaceApi.jl :)
@DhairyaLGandhi Regarding |
One improvement for Flux development would be integration with a comment bot so that the benchmarks can be invoked on PRs to see performance differences. This would require being able to invoke the benchmarks with user-specified versions for Flux, Zygote, NNlib, etc.
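As a rough illustration of what "user-specified versions" could look like on the bot side, the following Python sketch parses a PR comment into a package-to-version mapping. The trigger name `@benchmark-bot`, the `benchmark` keyword and the `Package=version` syntax are assumptions made up for this example; they are not the syntax of any existing bot mentioned in this thread.

```python
import re

# Hypothetical trigger and command word; a real bot would define its own.
TRIGGER = "@benchmark-bot"
COMMAND = "benchmark"

# One "Package=version" token, e.g. Flux=0.13.4 or Zygote=master.
SPEC = re.compile(r"^([A-Za-z][A-Za-z0-9_.]*)=([A-Za-z0-9_.\-/]+)$")


def parse_benchmark_request(comment, defaults=None):
    """Return {package: version} for the first bot command found, else None.

    `defaults` supplies versions for packages the comment does not mention.
    Raises ValueError if a token after the command is not Package=version.
    """
    for line in comment.splitlines():
        tokens = line.strip().split()
        if len(tokens) < 2 or tokens[0] != TRIGGER or tokens[1] != COMMAND:
            continue  # not a bot command, keep scanning
        specs = dict(defaults or {})
        for token in tokens[2:]:
            match = SPEC.match(token)
            if match is None:
                raise ValueError(f"cannot parse package spec: {token!r}")
            specs[match.group(1)] = match.group(2)
        return specs
    return None


if __name__ == "__main__":
    comment = "Looks good to me.\n@benchmark-bot benchmark Flux=0.13.4 Zygote=master"
    print(parse_benchmark_request(comment, defaults={"NNlib": "0.8"}))
```

The resulting mapping is what a benchmark runner would need in order to pin each package before running, so that two runs on the same PR differ only in the versions requested.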
A Stipple dashboard would be awesome! The comment bot can be invoked with @ModelZookeeper and it can go to FluxBot.jl. It's set up and works with Buildkite as well.
This issue is two years old. Can someone give an update about the current state?
The action items are broad enough that no one of them is complete. It's mostly incremental progress using the (very limited) time we have as maintainers. For example:
The overarching theme among these updates is limited capacity. I can think of many improvements made over the past couple of years that would not have happened but for the effort of one, maybe two people outside of the core team. The reality of not being a funded project (small corp, big corp, academic or otherwise) is that contributions from the community are make or break when it comes to advancing the project beyond just keeping the lights on. If you, the reader, use Flux and are quite comfortable with Julia development, this is a call to action to help improve the ecosystem. Reach out to us on Slack, Zulip or Discourse, in public or privately, if the idea sounds interesting or if you have ideas of your own. Also, did you know we have a bi-weekly meeting on Fridays? Check it out on the community calendar. Until then, hopefully this update was helpful.
Prompted by discussions on Discourse and Slack, I think we sorely need a roadmap issue for FluxML. This issue will serve as that roadmap; the hope is that we build it together. This roadmap is the BDFL (and bors). Some things are technical, some things are organizational. Feel free to suggest more tasks. If you think something should not be on the list, then suggest that too.
Governance
We don't seem to have a clear governance model. Officially, we follow ColPrac, and I think if we lean into it, it can be a sustainable model for the org. The contributing guide seems like the correct place to document this.
Technical
This isn't meant to be a comprehensive list, but it should detail our top priorities for what to work on when we aren't squashing bugs.
I explicitly started with a short list, because some of these tasks, like CI, need to be dealt with first before we can reliably tackle anything else.
I'd also ask that we try to be honest and constructive here. Nothing above is totally solved, so commenting on what's left is more helpful than "oh that's not an issue because X does 70% of it and the rest is easy." If the rest is easy, then open a linked issue detailing what the rest is so that someone can tackle it.
Lastly, let's limit comments to Flux maintainers for the most part. Anyone is welcome to suggest stuff that we are missing of course, but it would be good to pre-start the list with comments from folks who have been working on the packages for some time.
@DhairyaLGandhi @CarloLucibello @mcabbott @ToucheSir @lorenzoh @ChrisRackauckas @logankilpatrick