Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistant package load time #28425

Closed
ufechner7 opened this issue Aug 3, 2018 · 34 comments
Closed

Inconsistant package load time #28425

ufechner7 opened this issue Aug 3, 2018 · 34 comments

Comments

@ufechner7
Copy link

ufechner7 commented Aug 3, 2018

The load time of packages depends on the order in which the packages are loaded. This should be either documented or fixed. Example:

Given a file with the name test.jl and the following content:

function load1()
    @eval using PyPlot
    @eval using CSV
end

function load2()
    @eval using CSV
    @eval using PyPlot
end

If I start Julia 0.7 RC2 and run:

julia> include("test.jl")
load2 (generic function with 1 method)

julia> @time load1()
  5.195306 seconds (15.34 M allocations: 808.064 MiB, 4.63% gc time) 

If I restart Julia and run:

julia> include("test.jl")
load2 (generic function with 1 method)

julia> @time load2()
  6.148743 seconds (21.50 M allocations: 1.083 GiB, 5.38% gc time)

This is inconsistant. It would mean we should load large packages first to reduce the total load time. For me this looks buggy. Also the allocations differ a lot.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Aug 3, 2018

I think this is an expected aspect of the way precompilation works (and why precompilation is quadratic in the number of packages). If you turn compiled modules off (julia --compiled-modules=no) then these should take similar times to load.

@oscardssmith
Copy link
Member

Why is precompile quadratic? Is it just to prevent npmification?

@vtjnash
Copy link
Member

vtjnash commented Aug 3, 2018

It's not. It's the same time-complexity of loading the modules with compiled modules off.

@JeffBezanson
Copy link
Member

IIUC, this issue is not about precompilation time but load time (once everything is already precompiled). Precompilation is slightly super-linear but loading is not.

The order probably matters due to method invalidations. Given that the difference is 1 second I really don't think this is a bug, and we should focus instead on just speeding up loading and precompiling as much as possible.

@ufechner7
Copy link
Author

Well, the difference for loading PyPlot as first or second package is 2.3s on my (very fast) PC, on a normal laptop the difference is probably 2-3 times as much:
Time for first loading:

julia> @time using PyPlot
  2.884559 seconds (8.71 M allocations: 442.837 MiB, 4.04% gc time)

Time for loading as second package:


julia> @time using CSV
  0.848260 seconds (2.88 M allocations: 187.064 MiB, 2.29% gc time)

julia> @time using PyPlot
  5.202185 seconds (17.81 M allocations: 882.974 MiB, 5.75% gc time)

If it is not a bug it should be documented that large packages should be loaded first.

@JeffBezanson
Copy link
Member

I really think it's overly fussy to tell people to load large packages first. You don't generally know how big a package is, and size is only loosely correlated to method invalidations and load time. Furthermore, this phenomenon doesn't just affect package loading but future compilation as well. If the time to run some code after loading packages is included, the effect might be different. It just seems crazy to try to guess what might make compilation take 2 seconds longer.

Here's an idea: why don't we just make package loading 10x faster, so that the difference is 0.2sec instead of 2sec? Would that still be a problem?

@ufechner7
Copy link
Author

If this is a realistic goal, then indeed this issue doesn't matter.

@JeffBezanson
Copy link
Member

Of course, I forgot to mention the obvious alternative, which is to make loading slower so that the times are more similar. For example we could call sleep to make sure each package takes at least say, 10 seconds to load. Then this issue could be closed :-P

@tknopp
Copy link
Contributor

tknopp commented Aug 3, 2018

+1 to making package load time 10x faster. Looking forward to this.

@saschatimme
Copy link
Contributor

saschatimme commented Aug 11, 2018

The order probably matters due to method invalidations. Given that the difference is 1 second I really don't think this is a bug, and we should focus instead on just speeding up loading and precompiling as much as possible.

@JeffBezanson I think this should be noted a bug, since I encountered a case where it is indeed a difference of over 300 seconds. Also it seems that loading times of packages which have common dependencies increased dramatically in 1.0.

0.7:

julia> versioninfo()
Julia Version 0.7.0
Commit a4cb80f3ed (2018-08-08 06:46 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

julia> @time using HomotopyContinuation, PolynomialTestSystems
  4.948004 seconds (12.45 M allocations: 689.326 MiB, 8.61% gc time)

# In a new session
julia> @time using DynamicPolynomials, MultivariatePolynomials
  2.757743 seconds (8.49 M allocations: 418.643 MiB, 7.19% gc time)

# another session
julia> @time using MultivariatePolynomials, DynamicPolynomials
  2.083555 seconds (5.41 M allocations: 263.049 MiB, 5.66% gc time)

1.0:

julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

julia> @time using HomotopyContinuation, PolynomialTestSystems
324.221565 seconds (615.82 M allocations: 35.967 GiB, 5.71% gc time)
# in a new session
julia> @time using DynamicPolynomials, MultivariatePolynomials
 20.442672 seconds (45.11 M allocations: 2.419 GiB, 3.39% gc time)
# another session
julia> @time using MultivariatePolynomials, DynamicPolynomials
  1.880315 seconds (5.14 M allocations: 250.673 MiB, 5.42% gc time)

Note that the loading time is normal if the using statements are on different lines in the REPL. But I intentional put them on the same line since this is the same behaviour I encounter when I load the packages in a script.

Here is gist with the Manifest.toml and Project.toml to reproduce this.

cc: @blegat

@saschatimme
Copy link
Contributor

Some more timings on 1.0

julia> @time using HomotopyContinuation
  1.491884 seconds (2.98 M allocations: 222.024 MiB, 8.96% gc time)

#new session
julia> @time using PolynomialTestSystems, HomotopyContinuation
 21.043676 seconds (47.94 M allocations: 2.624 GiB, 3.65% gc time)

#new session
julia> time using PolynomialTestSystems
 @time 0.352957 seconds (756.73 k allocations: 44.335 MiB, 4.77% gc time)

julia> @time using HomotopyContinuation
  2.507023 seconds (6.33 M allocations: 375.173 MiB, 6.18% gc time)

@KristofferC
Copy link
Member

Note that the loading time is normal if the using statements are on different lines in the REPL.

Ref #27549.

@saschatimme
Copy link
Contributor

Some more investigation. Assume I have the file import-test.jl

import DynamicPolynomials
sleep(0.1)
import MultivariatePolynomials

Then I get (again on 1.0):

julia> @time include("import-test.jl");
  3.297908 seconds (10.88 M allocations: 536.594 MiB, 6.95% gc time)

whereas when I remove the sleep(0.1) line I get

julia> @time include("import-test.jl");
 20.558687 seconds (45.11 M allocations: 2.419 GiB, 3.57% gc time)

@blegat
Copy link
Contributor

blegat commented Aug 30, 2018

One example for which inverting the order makes it 50 times slower (note: it require JuMP master and PolyJuMP master, it uses the Julia v1.0 binaries for linux downloaded from the julialang website):

$ time julia -e 'using Compat; using PolyJuMP'
julia -e   1.90s user 0.67s system 137% cpu 1.872 totalj
$ time julia -e 'using PolyJuMP; using Compat'   
julia -e   93.51s user 0.74s system 100% cpu 1:33.74 total

As a rule of thumb, it seems that these issues always occur when loading a package that was already loaded inside another package.
For instance here Compat is already loaded by PolyJuMP. I have also checked removing the using Compat statement in PolyJuMP but the issues persists because Compat is loaded by JuMP which is loaded by PolyJuMP.
@saschatimme's example is consistent with this rule of thumb since MultivariatePolynomials is loaded by DynamicPolynomials.

@tknopp
Copy link
Contributor

tknopp commented Aug 30, 2018

could some of you test the suggestion in #28607 (comment) ?

@blegat
Copy link
Contributor

blegat commented Aug 30, 2018

For the code

@time using DynamicPolynomials
@time using MultivariatePolynomials

saved in a file. Executing the file with Julia v1.0 binaries:

  0.554849 seconds (639.39 k allocations: 38.740 MiB)
 27.112807 seconds (43.61 M allocations: 2.335 GiB, 3.35% gc time)

Executing the code from master (more precisely 2346be8)

  6.229270 seconds (1.18 M allocations: 65.632 MiB, 0.12% gc time)
  2.241962 seconds (7.52 M allocations: 364.837 MiB, 8.25% gc time)

After reverting 614d917 (i.e. PR #28290)

  6.575796 seconds (1.17 M allocations: 65.516 MiB, 0.05% gc time)
  0.004358 seconds (237 allocations: 12.641 KiB, 97.13% gc time)

@saschatimme
Copy link
Contributor

@blegat Good to know that it is already better on master! If you already have everything setup, can you bisect from 0.7 on to see which commit „fixed“ the worst behavior such that we can make sure this will be backported to 1.0?

@blegat
Copy link
Contributor

blegat commented Aug 31, 2018

I have done a bisection, the commits that fixes the issue with the example:

@time using DynamicPolynomials
@time using MultivariatePolynomials

is 472fe5f.
Indeed, with the commit 48a9ec4, I have

  0.353760 seconds (636.32 k allocations: 38.602 MiB)
 18.815349 seconds (43.61 M allocations: 2.335 GiB, 3.87% gc time)

and with the commit 472fe5f, I have:

  0.339032 seconds (604.47 k allocations: 36.739 MiB)
  2.215508 seconds (7.34 M allocations: 356.452 MiB, 7.13% gc time)

@KristofferC
Copy link
Member

I must ask... are you sure? It seems crazy that that commit would have such an effect.

@StefanKarpinski
Copy link
Member

That does seem very unlikely to be the actual cause.

@saschatimme
Copy link
Contributor

@KristofferC @StefanKarpinski I redid the bisect and landed on the same commit as @blegat.

@KristofferC
Copy link
Member

I get

julia> versioninfo()
Julia Version 1.1.0-DEV.26
Commit 48a9ec4083* (2018-08-15 18:05 UTC)

julia> @time using DynamicPolynomials
  0.408226 seconds (658.66 k allocations: 39.622 MiB, 3.62% gc time)

julia> @time using MultivariatePolynomials
  1.701578 seconds (4.48 M allocations: 212.359 MiB, 4.31% gc time)

@blegat
Copy link
Contributor

blegat commented Sep 17, 2018

@KristofferC You need to do @time using DynamicPolynomials; @time using MultivariatePolynomials or storing the two lines in a file and including the file for the bug to occur, see @saschatimme note in #28425 (comment):

Note that the loading time is normal if the using statements are on different lines in the REPL. But I intentional put them on the same line since this is the same behaviour I encounter when I load the packages in a script.

@KristofferC
Copy link
Member

Oh, missed that.

@KristofferC
Copy link
Member

KristofferC commented Sep 17, 2018

Ok, I can repro and this is pretty WTF to me...

Here are some profile logs from profiling the second using if that can be helpful to someone https://gist.github.com/KristofferC/f0269c7c4181ccfed61eb679d83b591b...

Im guessting that some backedge was broken in that PR that caused a lot less to have to be recompiled or something...

@saschatimme
Copy link
Contributor

saschatimme commented Sep 27, 2018

Things are still not working under 1.0.1-pre.4 although the bisection commit is backported.

   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1-pre.4 (2018-09-08)
 _/ |\__'_|_|_|\__'_|  |  release-1.0/dc40b6f89e (fork: 5 commits, 49 days)
|__/                   |

julia> @time using DynamicPolynomials, MultivariatePolynomials
 43.330891 seconds (71.10 M allocations: 3.859 GiB, 2.97% gc time)

~/coding/julia-src/julia $ ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1-pre.4 (2018-09-08)
 _/ |\__'_|_|_|\__'_|  |  release-1.0/dc40b6f89e (fork: 5 commits, 49 days)
|__/                   |

julia> @time using MultivariatePolynomials, DynamicPolynomials
  2.941291 seconds (6.68 M allocations: 332.664 MiB, 4.86% gc time)

BUT I noticed that although the second statement only states 2.9 seconds it takes probably the same amount of time if you count the time between pressing enter until the REPL is working again. So the 2.9 seconds timing is definitely false. @KristofferC do you have any idea how this false reporting can happen?

I really hope that this issue can be sorted out before the 1.0.1 release since I am basically stuck with 0.7 until it is fixed (and I have to tell everybody who wants to use my software that they should use 0.7 since 1.0 is broken). Also note that this behaviour does not only affects this combination of packages but rather all packages which specialize a lot methods from some other package.

@mauro3
Copy link
Contributor

mauro3 commented Sep 27, 2018

You're probably out of luck as 1.0.1 is essentially done: https://discourse.julialang.org/t/julia-1-0-1-testing-period/15534

@KristofferC
Copy link
Member

We should probably try reverting #28290 but fix up the issues the original PR had.

@saschatimme
Copy link
Contributor

It turns out, I somehow didn't pull the latest version of the release 1.0 branch. To be precise with release-1.0/398d87829d everything seems to work as expected.

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1-pre.171 (2018-09-26)
 _/ |\__'_|_|_|\__'_|  |  release-1.0/398d87829d (fork: 172 commits, 49 days)
|__/                   |

julia> @time using DynamicPolynomials, MultivariatePolynomials;
  4.385238 seconds (9.58 M allocations: 497.144 MiB, 3.90% gc time)

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1-pre.171 (2018-09-26)
 _/ |\__'_|_|_|\__'_|  |  release-1.0/398d87829d (fork: 172 commits, 49 days)
|__/                   |

julia> @time using MultivariatePolynomials, DynamicPolynomials;
  3.057933 seconds (6.67 M allocations: 330.516 MiB, 4.56% gc time)

I am very sorry for the additional noise here and assume / hope that this issue is now resolved :)

@blegat
Copy link
Contributor

blegat commented Sep 28, 2018

It also solves one other case StatsBase and HomotopyContinuation

$ time julia -e 'using StatsBase ; using HomotopyContinuation'
julia -e 'using StatsBase ; using HomotopyContinuation'  1.84s user 0.56s system 129% cpu 1.852 total
$ time julia -e 'using HomotopyContinuation ; using StatsBase'
julia -e 'using HomotopyContinuation ; using StatsBase'  278.70s user 0.84s system 100% cpu 4:39.51 total
$ time /home/blegat/git/julia/julia -e 'using StatsBase ; using HomotopyContinuation'
/home/blegat/git/julia/julia -e 'using StatsBase ; using HomotopyContinuation  2.06s user 0.55s system 120% cpu 2.159 total
$ time /home/blegat/git/julia/julia -e 'using HomotopyContinuation ; using StatsBase'
/home/blegat/git/julia/julia -e 'using HomotopyContinuation ; using StatsBase  4.90s user 0.55s system 109% cpu 4.999 total
$ julia
julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

$ /home/blegat/git/julia/julia 
julia> versioninfo()
Julia Version 1.0.1-pre.171
Commit 398d87829d (2018-09-26 22:52 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

@ufechner7
Copy link
Author

The situation became a lot worse for me: The load time for the package CSV increased from 0.84s to 8.8s, more than 10 times. The reason might be changes in the CSV package, but if small changes to a package cause such an increase of the load time I would call it "Inconsistant package load time". This makes Julia 1.x currently unusable for me. :(

@KristofferC
Copy link
Member

I thought CSV got almost completely rewritten? Does the same CSV version you tested before also show regressions?

@ufechner7
Copy link
Author

ufechner7 commented Oct 13, 2018

Even if I do
add CSV@0.3.0
and use the oldest version of CSV that can be used on 0.7 without warnings, I still have more than 4s package load time, where it was 0.84s before.
So the reason cannot only be the major rewrite, which happen after version 0.3.0.

And on my laptop the load time is now about 15s with CSV v0.4.1 and still 7s with CSV v0.3.0, where it was about 2s with CVS 2.5 and Julia 0.6.

@KristofferC
Copy link
Member

Ok so you have two versions of julia with the same package version and much worse load time now? Then we should bisecting and try find the commit that introduced the regression.

@vtjnash vtjnash closed this as completed Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants