-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply ChainRulesCore.jl's projection operators #153
Conversation
In general rule, like We kicked out all the An argument is that we might well want to depend on FillArrays in ChainRules.jl Did you start julia withj |
Since ChainRulesCore.jl depends on StaticArrays.jl it makes sense for it to also depend on FillArrays.jl |
It does not. |
OK. But StaticArrays.jl doesn't depend on ChainRulesCore.jl.... |
yes, because Static Arrays implements a unstructured dense array type. |
No, but the times are pretty consistent. If Revise takes a long time to think about things, that's relevant time for actual use. Trying now with
After:
Edit: that's 1.8- on an M1. Trying 1.6 on some older xeon, again with no startup file, I get:
and if I load Revise first, then these become:
|
50% longer but still <0.12 seconds. |
The other angle is that FillArrays.jl May become part of Base |
I would be keen on that. |
I think in that case it's better to put this in a new package, FillArrayChainRules.jl, for now. |
Would that not be quite confusing to users of FillArrays? If they are using an AD system that uses ChainRules but does not depend on FillArrays (ForwardDiff2, Nabla, Yota) they would have to know that they need to add a new dependency on FillArraysChainRules? Might be quite hard to figure that out from an error message? |
If it's confusing for users that is their problem. It's not a good reason to make a package bloated and hard to maintain. |
Or... have ChainRulesCore.jl depend on FillArrays.jl and put it there! It seems you don't want your package to be bloated but you are happy to bloat other peoples packages... |
Apologies if this discussion has upset you in some way, that was not the intention. We are all just trying to find the best solution to this. It might be what you suggest, it might be what we suggest, it might be neither. But the best way forward is to hear out all arguments and decide collectively based on that. Of course ultimately as a FillArrays code owner you have the right to refuse this PR outright if you think that's what best for the package. |
that's precisely what I'm saying, the best for this package and its maintenance is to not accept this PR. |
for d in 1:max(ndims(dx), length(project.axes)) | ||
size(dx, d) == length(get(project.axes, d, 1)) || throw(_projection_mismatch(axes_x, size(dx))) | ||
end | ||
Fill(mean(dx), project.axes) # Note that mean(dx::Fill) is optimised |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need another rule for the constructor to multiply the mean dx by the length of the vector? Think of x -> sum(Fill(x, 3))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the equivalent of https://github.com/FluxML/Zygote.jl/blob/e6a86745d66b5974eaafa8a8f28bcd4b100374df/src/lib/array.jl#L17
If the constructor is close to where the Fill is used, then perhaps it's a little wasteful to first project like this, and then un-create. But not so serious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
I think it is indeed completely legit that you don't want to have the code for ChainRulesCore in your package. Simultaneously, it is also legit that ChainRulesCore can't take this dependency, since we we also are super light-weight, and also are not imposing it on our 1500 indirect dependents is also too much. So the remaining solution is to have this in some package, that does type-piracy. The option of ChainRules.jl isn't great because:
So better, I think is indeed to make a separate ChainRules_FillArrays.jl (we can put it in JuliaDiff). |
Good to lay these things out. To me these reasons don't seem to obviously counteract the extra hassle of having one more package (and one more repository, possibly) to look after. It's already pretty fragmented having CR + CRC + CRTU, I mean I see why these exist but there is friction & complexity in having them separate.
|
The testing problem i mean is not for this package, but rather for some other package.
All of the big, required, changes driven by the to the testing system are now over, since was have released 1.0. Expectation is the big one; and as you say the blessing; that goes with it. We don't want to end-up with fly-by-night packages as dependencies of ChainRules.jl. And on the topic of dependencies: Further, would make ChainRules end-up blocking things from updating til many things are updated. Finally, these |
Hi, was there any decision as to what's actually going to happen? As a user who would like to use FillArrays's structured arrays as well as autodiff through those arrays, is there some way for me to actually get this together? |
I think we should:
|
I think the convention is no underscores |
Maybe this PR could be revisited and ChainRulesCore made a weak dependency? That would make the definitions available at least on Julia >= 1.9, and should not cause any additional loading or compilation time for users that do not load ChainRulesCore. |
What's a weak dependency? |
|
ChainRules now has a projection mechanism to preserve, among other things, the structure of structured arrays. This should probably apply to FillArrays. So this PR writes a few methods.
However, I'm not sure where it should live. This package depends on nothing at all besides the standard library. My vote would be for it to live in ChainRules, which is already a bigger package. (And Zygote which loads ChainRules already loads FillArrays anyway). But I open the PR here just because I wrote the code on this fork, and to discuss.
CC @oxinabox, @mzgubic
Loading times, on 1.8-, before:
And after: