Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid __init__ to reduce load time #340

Closed
KristofferC opened this issue May 2, 2021 · 7 comments · Fixed by #356
Closed

Avoid __init__ to reduce load time #340

KristofferC opened this issue May 2, 2021 · 7 comments · Fixed by #356
Milestone

Comments

@KristofferC
Copy link

KristofferC commented May 2, 2021

ChainRulesCore causes significant load time regressions in packages that depend on it (e.g. JuliaMath/SpecialFunctions.jl#310). A significant amount of that is contributed to the __init__ function that unconditionally has to be run when the package is loaded since this has to be inferred and compiled.

Removing the __init__ reduces approximately cuts the load time of the package to a factor of 1/3. If this package is supposed to be dependent on by such core packages as SpecialFunctions it would be good if its goals could be achieved with standard dispatch which the package loading system is optimized for, instead of running arbitrary code at load time.

@oxinabox
Copy link
Member

oxinabox commented May 2, 2021

Hmm, good to know.

In the short term we can probably push this code down into either the init for the AD systems that need it, or perhaps into ChainRules.jl.
Since it's not needed for defining rules, just for using them to generate code automatically.

Which we only have to allow ADs to do because it let's them generate code for rule overloads that use generic dispatch.
Which they need because Cassette etc keeps breaking type inference.
And since they need to generate code in response to rules being defined they need a way to know if a package has defined rules.

Do you know how much of the load time is from having and __init__ at all vs from the stuff the __init__ is doing?

simeonschaub added a commit that referenced this issue May 2, 2021
With this PR:
```
(ChainRulesCore) pkg> precompile
Precompiling project...
  1 dependency successfully precompiled in 2 seconds (1 already precompiled)

julia> @time using ChainRulesCore
  0.082255 seconds (157.13 k allocations: 9.737 MiB, 5.59% compilation time)
```

Before:
```
julia> @time using ChainRulesCore
  0.111551 seconds (443.47 k allocations: 25.618 MiB, 3.87% compilation time)
```

Removing the hook alltogether:
```
julia> @time using ChainRulesCore
  0.033357 seconds (36.37 k allocations: 2.399 MiB, 17.75% compilation time)
```

That's still more overhead than I'd like, so we should still discuss
whether we could avoid these hooks, but it does at least improve the current
situation somewhat.

Ref #340
simeonschaub added a commit that referenced this issue May 2, 2021
With this PR:
```
(ChainRulesCore) pkg> precompile
Precompiling project...
  1 dependency successfully precompiled in 2 seconds (1 already precompiled)

julia> @time using ChainRulesCore
  0.082255 seconds (157.13 k allocations: 9.737 MiB, 5.59% compilation time)
```

Before:
```
julia> @time using ChainRulesCore
  0.111551 seconds (443.47 k allocations: 25.618 MiB, 3.87% compilation time)
```

Removing the hook alltogether:
```
julia> @time using ChainRulesCore
  0.033357 seconds (36.37 k allocations: 2.399 MiB, 17.75% compilation time)
```

That's still more overhead than I'd like, so we should still discuss
whether we could avoid these hooks, but it does at least improve the current
situation somewhat.

Ref #340
@oxinabox
Copy link
Member

oxinabox commented May 2, 2021

Removing this might be argued to be breaking.
But AFAIK noone is using this feature just yet, apparently from an unmerged branch of Nabla.jl which can be updated before merging.
ForwardDiff2 wanted this feature (they sufferred badly from Cassette breaking inference) but is on hiatus and so it isn't using it.

So I think we can remove it as nonbreaking, and revert if someone says it breaks their code.
Ideally we would do so while adding/documenting a new way to make sure rule refresh gets triggered after loading any package that defines rules.
But perhaps that can wait and come in a follow up PR (perhaps when I go back to that Nabla one, which I am due to back back on in 4 weeks or so).

How much of a rush are you for this?

simeonschaub added a commit that referenced this issue May 2, 2021
With this PR:
```
(ChainRulesCore) pkg> precompile
Precompiling project...
  1 dependency successfully precompiled in 2 seconds (1 already precompiled)

julia> @time using ChainRulesCore
  0.082255 seconds (157.13 k allocations: 9.737 MiB, 5.59% compilation time)
```

Before:
```
julia> @time using ChainRulesCore
  0.111551 seconds (443.47 k allocations: 25.618 MiB, 3.87% compilation time)
```

Removing the hook alltogether:
```
julia> @time using ChainRulesCore
  0.033357 seconds (36.37 k allocations: 2.399 MiB, 17.75% compilation time)
```

That's still more overhead than I'd like, so we should still discuss
whether we could avoid these hooks, but it does at least improve the current
situation somewhat.

Ref #340
@simeonschaub
Copy link
Member

Thinking a bit more about this, is there any reason these hooks need to be added by ChainRulesCore? Perhaps we can move these hooks to ChainRules or even its own small package, so they are only added, when they are actually required by an AD.

@KristofferC
Copy link
Author

How much of a rush are you for this?

Very low :P, I've just started measuring some package load times based on some complaints and now filing issues / PRs wherever it seems reasonable.

@oxinabox
Copy link
Member

oxinabox commented May 2, 2021

Thinking a bit more about this, is there any reason these hooks need to be added by ChainRulesCore? Perhaps we can move these hooks to ChainRules or even its own small package, so they are only added, when they are actually required by an AD.

We can almost definitely remove the attaching a package load hook in the init.
We might be able to move 100% of the code generation hook stuff out.
I would need to double check.
I am loath to add a 5th package to the family, but might be best.

Edit:
I am coming round to a seperate package for all the code generation hooks.
It would mean we could tag 1.0 for ChainRulesCore while leaving that less mature code generation stuff still 0.x

@stevengj
Copy link

stevengj commented May 4, 2021

Yes, please push as much as possible into a separate package. The only things in ChainRulesCore, in my mind, should be the functions necessary to define new differentiation rules — it should as lightweight as possible to make a no-brainer to add differentiability to any package.

@oxinabox oxinabox added this to the v0.10 milestone May 13, 2021
@oxinabox
Copy link
Member

I have created https://github.com/JuliaDiff/ChainRulesOverloadGeneration.jl to push out the overload generation to
https://github.com/JuliaDiff/ChainRulesOverloadGeneration.jl

(Right now it just has a slice out of the history of this repo pruned down to just contain the relevent files, but I will fix up names and make it work after)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants