-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistant package load time #28425
Comments
I think this is an expected aspect of the way precompilation works (and why precompilation is quadratic in the number of packages). If you turn compiled modules off ( |
Why is precompile quadratic? Is it just to prevent npmification? |
It's not. It's the same time-complexity of loading the modules with compiled modules off. |
IIUC, this issue is not about precompilation time but load time (once everything is already precompiled). Precompilation is slightly super-linear but loading is not. The order probably matters due to method invalidations. Given that the difference is 1 second I really don't think this is a bug, and we should focus instead on just speeding up loading and precompiling as much as possible. |
Well, the difference for loading PyPlot as first or second package is 2.3s on my (very fast) PC, on a normal laptop the difference is probably 2-3 times as much:
Time for loading as second package:
If it is not a bug it should be documented that large packages should be loaded first. |
I really think it's overly fussy to tell people to load large packages first. You don't generally know how big a package is, and size is only loosely correlated to method invalidations and load time. Furthermore, this phenomenon doesn't just affect package loading but future compilation as well. If the time to run some code after loading packages is included, the effect might be different. It just seems crazy to try to guess what might make compilation take 2 seconds longer. Here's an idea: why don't we just make package loading 10x faster, so that the difference is 0.2sec instead of 2sec? Would that still be a problem? |
If this is a realistic goal, then indeed this issue doesn't matter. |
Of course, I forgot to mention the obvious alternative, which is to make loading slower so that the times are more similar. For example we could call |
+1 to making package load time 10x faster. Looking forward to this. |
@JeffBezanson I think this should be noted a bug, since I encountered a case where it is indeed a difference of over 300 seconds. Also it seems that loading times of packages which have common dependencies increased dramatically in 1.0. 0.7: julia> versioninfo()
Julia Version 0.7.0
Commit a4cb80f3ed (2018-08-08 06:46 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin14.5.0)
CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
JULIA_NUM_THREADS = 4
julia> @time using HomotopyContinuation, PolynomialTestSystems
4.948004 seconds (12.45 M allocations: 689.326 MiB, 8.61% gc time)
# In a new session
julia> @time using DynamicPolynomials, MultivariatePolynomials
2.757743 seconds (8.49 M allocations: 418.643 MiB, 7.19% gc time)
# another session
julia> @time using MultivariatePolynomials, DynamicPolynomials
2.083555 seconds (5.41 M allocations: 263.049 MiB, 5.66% gc time) 1.0: julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin14.5.0)
CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
JULIA_NUM_THREADS = 4
julia> @time using HomotopyContinuation, PolynomialTestSystems
324.221565 seconds (615.82 M allocations: 35.967 GiB, 5.71% gc time)
# in a new session
julia> @time using DynamicPolynomials, MultivariatePolynomials
20.442672 seconds (45.11 M allocations: 2.419 GiB, 3.39% gc time)
# another session
julia> @time using MultivariatePolynomials, DynamicPolynomials
1.880315 seconds (5.14 M allocations: 250.673 MiB, 5.42% gc time) Note that the loading time is normal if the using statements are on different lines in the REPL. But I intentional put them on the same line since this is the same behaviour I encounter when I load the packages in a script. Here is gist with the Manifest.toml and Project.toml to reproduce this. cc: @blegat |
Some more timings on 1.0 julia> @time using HomotopyContinuation
1.491884 seconds (2.98 M allocations: 222.024 MiB, 8.96% gc time)
#new session
julia> @time using PolynomialTestSystems, HomotopyContinuation
21.043676 seconds (47.94 M allocations: 2.624 GiB, 3.65% gc time)
#new session
julia> time using PolynomialTestSystems
@time 0.352957 seconds (756.73 k allocations: 44.335 MiB, 4.77% gc time)
julia> @time using HomotopyContinuation
2.507023 seconds (6.33 M allocations: 375.173 MiB, 6.18% gc time) |
Ref #27549. |
Some more investigation. Assume I have the file import DynamicPolynomials
sleep(0.1)
import MultivariatePolynomials Then I get (again on 1.0): julia> @time include("import-test.jl");
3.297908 seconds (10.88 M allocations: 536.594 MiB, 6.95% gc time) whereas when I remove the julia> @time include("import-test.jl");
20.558687 seconds (45.11 M allocations: 2.419 GiB, 3.57% gc time) |
One example for which inverting the order makes it 50 times slower (note: it require JuMP master and PolyJuMP master, it uses the Julia v1.0 binaries for linux downloaded from the julialang website): $ time julia -e 'using Compat; using PolyJuMP'
julia -e 1.90s user 0.67s system 137% cpu 1.872 totalj
$ time julia -e 'using PolyJuMP; using Compat'
julia -e 93.51s user 0.74s system 100% cpu 1:33.74 total As a rule of thumb, it seems that these issues always occur when loading a package that was already loaded inside another package. |
could some of you test the suggestion in #28607 (comment) ? |
For the code @time using DynamicPolynomials
@time using MultivariatePolynomials saved in a file. Executing the file with Julia v1.0 binaries:
Executing the code from master (more precisely 2346be8)
After reverting 614d917 (i.e. PR #28290)
|
@blegat Good to know that it is already better on master! If you already have everything setup, can you bisect from 0.7 on to see which commit „fixed“ the worst behavior such that we can make sure this will be backported to 1.0? |
I have done a bisection, the commits that fixes the issue with the example: @time using DynamicPolynomials
@time using MultivariatePolynomials is 472fe5f.
and with the commit 472fe5f, I have:
|
I must ask... are you sure? It seems crazy that that commit would have such an effect. |
That does seem very unlikely to be the actual cause. |
@KristofferC @StefanKarpinski I redid the bisect and landed on the same commit as @blegat. |
I get julia> versioninfo()
Julia Version 1.1.0-DEV.26
Commit 48a9ec4083* (2018-08-15 18:05 UTC)
julia> @time using DynamicPolynomials
0.408226 seconds (658.66 k allocations: 39.622 MiB, 3.62% gc time)
julia> @time using MultivariatePolynomials
1.701578 seconds (4.48 M allocations: 212.359 MiB, 4.31% gc time) |
@KristofferC You need to do
|
Oh, missed that. |
Ok, I can repro and this is pretty WTF to me... Here are some profile logs from profiling the second Im guessting that some backedge was broken in that PR that caused a lot less to have to be recompiled or something... |
Things are still not working under 1.0.1-pre.4 although the bisection commit is backported. _ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.0.1-pre.4 (2018-09-08)
_/ |\__'_|_|_|\__'_| | release-1.0/dc40b6f89e (fork: 5 commits, 49 days)
|__/ |
julia> @time using DynamicPolynomials, MultivariatePolynomials
43.330891 seconds (71.10 M allocations: 3.859 GiB, 2.97% gc time)
~/coding/julia-src/julia $ ./julia
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.0.1-pre.4 (2018-09-08)
_/ |\__'_|_|_|\__'_| | release-1.0/dc40b6f89e (fork: 5 commits, 49 days)
|__/ |
julia> @time using MultivariatePolynomials, DynamicPolynomials
2.941291 seconds (6.68 M allocations: 332.664 MiB, 4.86% gc time) BUT I noticed that although the second statement only states 2.9 seconds it takes probably the same amount of time if you count the time between pressing enter until the REPL is working again. So the 2.9 seconds timing is definitely false. @KristofferC do you have any idea how this false reporting can happen? I really hope that this issue can be sorted out before the 1.0.1 release since I am basically stuck with 0.7 until it is fixed (and I have to tell everybody who wants to use my software that they should use 0.7 since 1.0 is broken). Also note that this behaviour does not only affects this combination of packages but rather all packages which specialize a lot methods from some other package. |
You're probably out of luck as 1.0.1 is essentially done: https://discourse.julialang.org/t/julia-1-0-1-testing-period/15534 |
We should probably try reverting #28290 but fix up the issues the original PR had. |
It turns out, I somehow didn't pull the latest version of the release 1.0 branch. To be precise with _
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.0.1-pre.171 (2018-09-26)
_/ |\__'_|_|_|\__'_| | release-1.0/398d87829d (fork: 172 commits, 49 days)
|__/ |
julia> @time using DynamicPolynomials, MultivariatePolynomials;
4.385238 seconds (9.58 M allocations: 497.144 MiB, 3.90% gc time)
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.0.1-pre.171 (2018-09-26)
_/ |\__'_|_|_|\__'_| | release-1.0/398d87829d (fork: 172 commits, 49 days)
|__/ |
julia> @time using MultivariatePolynomials, DynamicPolynomials;
3.057933 seconds (6.67 M allocations: 330.516 MiB, 4.56% gc time) I am very sorry for the additional noise here and assume / hope that this issue is now resolved :) |
It also solves one other case StatsBase and HomotopyContinuation
|
The situation became a lot worse for me: The load time for the package CSV increased from 0.84s to 8.8s, more than 10 times. The reason might be changes in the CSV package, but if small changes to a package cause such an increase of the load time I would call it "Inconsistant package load time". This makes Julia 1.x currently unusable for me. :( |
I thought CSV got almost completely rewritten? Does the same CSV version you tested before also show regressions? |
Even if I do And on my laptop the load time is now about 15s with CSV v0.4.1 and still 7s with CSV v0.3.0, where it was about 2s with CVS 2.5 and Julia 0.6. |
Ok so you have two versions of julia with the same package version and much worse load time now? Then we should bisecting and try find the commit that introduced the regression. |
The load time of packages depends on the order in which the packages are loaded. This should be either documented or fixed. Example:
Given a file with the name test.jl and the following content:
If I start Julia 0.7 RC2 and run:
If I restart Julia and run:
This is inconsistant. It would mean we should load large packages first to reduce the total load time. For me this looks buggy. Also the allocations differ a lot.
The text was updated successfully, but these errors were encountered: