-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ArrayInterfaceCore #211
Comments
We could take one or both of the following approaches:
Taking both would mean creating two libraries: one of the stubs, and then one of the basic functionality. |
@chriselrod your approaches sounds like (for shorthand):
I think there's a more functional divide. It seems to me like there are two kinds of users:
Basically, that's all to say I don't think Finally, I'm curious how this ties into the long-mentioned merge into base julia. |
For me, basic functionality are the things required for LoopVectorization to work on a custom array type. |
What's the measure here? Load time? |
I think it should be; anything like an ArrayInterfaceCore should take on the order of a tenth of a second to load, I think. Is there a breakdown of load time by file? What precisely makes ArrayInterface take so long to load? |
Looks like it hits that mark? julia> @time using ArrayInterface
0.098646 seconds (172.08 k allocations: 9.701 MiB, 40.21% compilation time) Though it's close 😅 . Likely influenced by the |
Are you sure you're in a clean environment, and also that this environment isn't located on a supercomputer?
|
Julia v1.7? |
... How?!
|
The problem with splitting things up is that the point is that everyone uses the same interface. I know that there's a lot of code related to the indexing pipeline and |
It might be worth splitting that code into a different package, both to reduce compile times (which are a bit on the iffy side, although definitely not a huge deal after 1.7) and to improve the code's visibility -- if the indexing features mostly aren't being used, that might be because people aren't aware of them. |
I think the appropriate solution for this is documentation (which the lack of is admittedly my fault). Nonetheless, I've drafted a PR that gets rid of the indexing code so people can begin to look at how this affects things. |
I’d definitely agree that’s not sufficient reason to separate this code out — it’s more of an incidental benefit you might get. The main hope would be splitting the package would reduce the size of the dependency without causing many problems. |
We could probably move |
I'm confused. Splitting to different packages and making use of Requires would only increase compile and startup times, not decrease it. I'm not sure why that's being proposed as the solution then to decreasing startup time. |
Why would we need Requires.jl? Most packages don't want to use ArrayInterface.jl. So put the methods that need to be extended to support ArrayInterface into ArrayInterfaceCore. This could reduce the use of Requires, if libraries like StaticArrays.jl are then willing to take ArrayInterfaceCore as a dependency, letting us drop the associated |
I don't think anyone has actually made a serious proposal to have StaticArrays.jl use ArrayInterface.jl. Perhaps there would be push back related to this issue but a lot of code in that package could be replaced by what we have here anyway. |
What parts are being proposed to move out and what is the v1.7 |
Im not sure its worth doing anything about this given the 1.7 performance improvements. Speaking to my small part in triggering this discussion, 0.1 seconds is not an issue at all for DimensionalData.jl. It will be swallowed by the improvements to load times of DD itself in 1.7. |
A library of stubs should load nearly instantly.
Making a bunch of internal changes costs developer time and risks breaking things, while leaving them as they are does not. If someone has the time, they could make a PR, although I'd recommend asking the StaticArrays maintainers first to see if they're likely to accept such a PR. |
I agree. Which things are making the loading slow, and why? I don't see anything clear here other than "yay smaller". Which functions need to move, and will it send it down 10%? Or will removing just one function reduce the using time by 99.9%? Could we just delete that function? Etc. |
I opened #212 to start showing what we could remove. There's a bit more that could be cut but most other changes would be clean up we should do anyway. |
How much using time would it cut? Can we make it a sub-package? |
Here's the current one for me julia> @time using ArrayInterface
1.419789 seconds (4.16 M allocations: 203.910 MiB, 11.25% gc time) and that PR... julia> @time using ArrayInterface
[ Info: Precompiling ArrayInterface [4fba245c-0d91-5ea0-9b3e-6abc04ee57a9]
1.339895 seconds (1.56 M allocations: 91.977 MiB, 0.70% gc time, 32.98% compilation time) Other people should probably benchmark it with more robust setups for benchmarking.
We could but I just don't really see the point. The majority of code that would stick around after cleanup is related
Initial changes shouldn't be that bad. For example, |
Time to post precompile both times. |
My suggestion with ArrayInterfaceCore was that it would just be stubs, e.g. function parent_type end that |
Wouldn't Sorry, I'm not trying to be difficult. Maybe an initial PR the proposed package as a submodule or perhaps a gist would help the conversation. |
When ArrayInterface is loaded. |
yes, that would also give explicit versioning on the requires'd pieces. |
Gotcha. I'll try looking into it in a couple days. |
This moves all of the non-`@require` code into an internal package. I managed to get initial tests to run for `ArrayInterfaceCore` but there will be several more combs through the tests to make sure we can separate out tests that are independent of downstream packages. This required addition of a simple proxy for `MArray` so we could test all of the static/known traits. Zero work has been done to actually test the ArrayInterace test at the top level here so far, but hopefully it will be fairly straightforward if there is an exhaustive list following `impor ArrayInterfaceCore: ...` (which should also prevent any noise downstream).
Early work on a branch for this here. |
A more lightweight version of ArrayInterface would be awsome, I think - it is nowadays a direct or indirect dependency of a large chunk of the ecosystem. I've been tracking down some package load times, and ArrayInterface often turns out to be a driving factor. With a package like FiniteDiff, for example: julia> @time_imports using FiniteDiff
3.2 ms ┌ Compat
0.5 ms ┌ Requires
538.4 ms ┌ StaticArrays
0.2 ms ┌ IfElse
33.1 ms ┌ Static
722.1 ms ┌ ArrayInterface
1512.6 ms FiniteDiff ArrayInterface makes up the biggest chunk of the load time, StaticArrays almost all of the rest, FiniteDiff itself is tiny: julia> @time_imports using ArrayInterface, StaticArrays
4.2 ms ┌ Compat
0.7 ms ┌ Requires
0.2 ms ┌ IfElse
35.8 ms ┌ Static
717.2 ms ArrayInterface
560.6 ms StaticArrays
julia> @time_imports using FiniteDiff
19.6 ms FiniteDiff FIniteDiff (just an example here, though it is itself a direct or indirect dependency of many packages) seems to mostly use I was surprised how lightweight even a package like Optim itself is, here is Optim's "load time story": julia> @time using ArrayInterface
0.690469 seconds (656.96 k allocations: 36.272 MiB, 2.19% gc time, 91.62% compilation time)
julia> @time using StaticArrays
0.588908 seconds (2.28 M allocations: 179.212 MiB, 7.03% compilation time)
julia> @time using FillArrays
0.152951 seconds (412.35 k allocations: 31.434 MiB)
julia> @time using ForwardDiff
0.212571 seconds (548.34 k allocations: 36.217 MiB, 2.03% compilation time)
julia> @time using DataStructures
0.068310 seconds (127.42 k allocations: 9.279 MiB)
julia> @time using Optim
0.077643 seconds (207.88 k allocations: 13.644 MiB, 7.29% compilation time) |
What is |
It's new in Julia 1.8 (I've been using v1.8.0-beta3 above). It really opened my eyes about the package load times in general in the last days. |
I want to reduce the horrid load time of BAT.jl by reducing it's dependencies and splitting it up along dependency lines, so I started to measure load times of a bunch of diff- and opt- and stats-related packages and was quite surprised how lightweight some packages are (on top of their deps) and how heavy some others (bat/BAT.jl#351). |
The split is done. The core is nice. The rest needs work, but hey, that was the point of this issue so closed. julia> @time_imports using ArrayInterfaceCore
28.4 ms ArrayInterfaceCore
julia> @time_imports using ArrayInterface
2.8 ms ┌ Compat
1.0 ms ┌ IfElse
45.1 ms ┌ Static
2.3 ms ┌ ArrayInterfaceCore
398.1 ms ArrayInterface
julia> @time_imports using ArrayInterfaceBandedMatrices
120.3 ms ┌ FillArrays
1005.0 ms ┌ ArrayLayouts
214.9 ms ┌ BandedMatrices
2.3 ms ┌ ArrayInterfaceCore
1251.9 ms ArrayInterfaceBandedMatrices
julia> @time_imports using ArrayInterfaceBlockBandedMatrices
122.4 ms ┌ FillArrays
1017.1 ms ┌ ArrayLayouts
207.1 ms ┌ BandedMatrices
138.1 ms ┌ MatrixFactorizations
2.5 ms ┌ ArrayInterfaceCore
1.3 ms ┌ ArrayInterfaceBandedMatrices
154.6 ms ┌ BlockArrays
41.7 ms ┌ BlockBandedMatrices
1595.4 ms ArrayInterfaceBlockBandedMatrices
julia> @time_imports using ArrayInterfaceCUDA
2.6 ms ┌ Compat
1.5 ms ┌ Requires
52.3 ms ┌ ChainRulesCore
60.7 ms ┌ AbstractFFTs
15.7 ms ┌ Preferences
17.6 ms ┌ JLLWrappers
173.7 ms ┌ LLVMExtra_jll
1.2 ms ┌ Reexport
23.5 ms ┌ RandomNumbers
38.3 ms ┌ Random123
4.0 ms ┌ CEnum
100.4 ms ┌ LLVM
2.0 ms ┌ Adapt
799.5 ms ┌ GPUArrays
1.3 ms ┌ ExprTools
1.1 ms ┌ IfElse
34.2 ms ┌ Static
2.3 ms ┌ ArrayInterfaceCore
381.2 ms ┌ ArrayInterface
24.7 ms ┌ TimerOutputs
1.7 ms ┌ CompilerSupportLibraries_jll
6.2 ms ┌ BFloat16s
193.5 ms ┌ GPUCompiler
1059.9 ms ┌ CUDA
2855.1 ms ArrayInterfaceCUDA
julia> @time_imports using ArrayInterfaceOffsetArrays
2.8 ms ┌ Compat
1.3 ms ┌ IfElse
30.5 ms ┌ Static
2.2 ms ┌ ArrayInterfaceCore
327.2 ms ┌ ArrayInterface
1.6 ms ┌ Adapt
35.4 ms ┌ OffsetArrays
405.6 ms ArrayInterfaceOffsetArrays
julia> @time_imports using ArrayInterfaceStaticArrays
2.5 ms ┌ Compat
460.9 ms ┌ StaticArrays
1.1 ms ┌ IfElse
33.4 ms ┌ Static
2.5 ms ┌ ArrayInterfaceCore
333.8 ms ┌ ArrayInterface
1.8 ms ┌ Adapt
837.3 ms ArrayInterfaceStaticArrays
julia> @time_imports using ArrayInterfaceTracker
467.2 ms ┌ StaticArrays
478.1 ms ┌ DiffResults
2.9 ms ┌ Compat
1.3 ms ┌ NaNMath
1.3 ms ┌ Requires
53.7 ms ┌ ChainRulesCore
55.9 ms ┌ ChangesOfVariables
1.1 ms ┌ OpenLibm_jll
1.5 ms ┌ InverseFunctions
4.5 ms ┌ DocStringExtensions
5.3 ms ┌ IrrationalConstants
1.5 ms ┌ CompilerSupportLibraries_jll
2.0 ms ┌ LogExpFunctions
36.3 ms ┌ Preferences
38.6 ms ┌ JLLWrappers
187.3 ms ┌ OpenSpecFun_jll
203.6 ms ┌ SpecialFunctions
11.1 ms ┌ MacroTools
12.9 ms ┌ CommonSubexpressions
1.4 ms ┌ DiffRules
298.4 ms ┌ ForwardDiff
1.8 ms ┌ Adapt
2.4 ms ┌ ArrayInterfaceCore
26.4 ms ┌ NNlib
561.0 ms ┌ Tracker
1815.3 ms ArrayInterfaceTracker |
Amazing! |
So now we just need to ask a bunch of packages to switch deps? :-) |
CompatHelper will tell them ArrayInterface had a breaking update, and yes most packages should then choose to update to Core. |
One of the biggest remaining issues is that GPUArrays.jl is still heavy: So any package that removes Requires.jl is not going to cut down on using time until there's a better target than GPUArrays.jl. If that goes Core as well, then anything that's not static is almost instant. |
Could reduce the load time of FiniteDiff by over 50% (the remainder will be StaticArrays, which is harder to address). |
Yes, I'm doing those PRs now. RecursiveArrayTools first, and a bit of shaming on StaticArrays |
Amen to that! I did created an issue regarding it's load time (JuliaArrays/StaticArrays.jl#1023) - but to be honest I have to admit that I don't have a concrete proposal on how a lightweight StaticArraysBase.jl or so could look like, without behavior changes when the full one is loaded. |
See my comment. We only need 7 functions out of the whole thing, and then we delete the second biggest contributor to load times after GPUArrays.jl. Just look at this: julia> @time_imports using RecursiveArrayTools
10.1 ms ┌ MacroTools
18.7 ms ┌ ZygoteRules
2.6 ms ┌ Compat
123.9 ms ┌ FillArrays
521.7 ms ┌ StaticArrays
18.4 ms ┌ Preferences
20.2 ms ┌ JLLWrappers
173.5 ms ┌ LLVMExtra_jll
4.8 ms ┌ CEnum
103.7 ms ┌ LLVM
2.0 ms ┌ Adapt
815.3 ms ┌ GPUArrays
5.2 ms ┌ DocStringExtensions
40.3 ms ┌ RecipesBase
2.6 ms ┌ ArrayInterfaceCore
56.6 ms ┌ ChainRulesCore
1822.5 ms RecursiveArrayTools Fix those two and we go from |
Wow! We should send a big box of whatever is his favourite to @IanButterworth for JuliaLang/julia#41612 ! |
Cool! Nice to see it helping. If you haven't already, try it in nightly. It reports compilation time and recompilation if any (due to upstream invalidations) |
Oh, that could really ... encourage ... package maintainers to take a closer look at a few things. :-) |
Will that still go into 1.8 as well? |
It was added recently and is not marked for backport, but perhaps that's reasonable.. might be worth suggesting it |
As mentioned here,
ArrayInterface
has become a huge dependency to take on. It might make sense to separate out someArrayInterfaceCore
that can be loaded faster and contains only basic functionality.The text was updated successfully, but these errors were encountered: