Non-deterministic allocations in function call #27768

Closed
pablosanjose opened this issue Jun 25, 2018 · 9 comments
Labels
compiler:inference (Type inference), compiler:precompilation (Precompilation of modules), regression (Regression in behavior compared to a previous version)

Comments

@pablosanjose
Contributor

I've run into a very annoying problem whereby a certain package I'm writing would be fast on a fresh REPL, showing few allocations, and then on a different fresh REPL it would be very, very slow, with a huge number of allocations... without touching the code! This is an effort to isolate the problem into an MWE (not sure if it can be made even smaller). To reproduce the issue you will likely need to try several times; for me the behaviour is highly erratic.

So: the following module elsa, when developed in ./julia/dev, produces non-deterministic allocations when running test() on the current 0.7-beta (I'm not sure when this began):

module elsa
    function test()  
        for i in 1:10_000
            foo!([0.0], 1, 1, 1, 1, rand(), 1, 1, (1, 1))   
        end   
    end 
    @noinline function foo!(vec, x1, x2, x3, x4, x5, x6, x7, (i, j))
        vec[i] = i 
        return nothing 
    end 
end
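
A minimal sketch of one way to set this up, assuming the module above is saved as ~/.julia/dev/elsa/src/elsa.jl (the exact path and loading mechanism are assumptions, not stated above):

# Assumed layout: the module saved as ~/.julia/dev/elsa/src/elsa.jl.
# Making that directory visible to the loader lets `using elsa` go through
# precompilation, which is what seems to trigger the erratic allocations.
push!(LOAD_PATH, expanduser("~/.julia/dev"))
using elsa       # precompiles (or loads a possibly stale cache) on first use
elsa.test()      # warm up once, then time with @time as below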

First run in a fresh REPL

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-beta.7 (2018-06-25 01:11 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 0978251ac5 (0 days old master)
|__/                   |  x86_64-apple-darwin17.6.0

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
[ Info: Recompiling stale cache file /Users/pablo/.julia/compiled/v0.7/elsa/Jexn.ji for module elsa
  0.034960 seconds (122.36 k allocations: 5.917 MiB)
  0.007761 seconds (40.00 k allocations: 1.679 MiB, 69.58% gc time)
  0.001392 seconds (40.00 k allocations: 1.679 MiB)

Subsequent runs in a fresh REPL (no recompile)

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.025599 seconds (107.65 k allocations: 5.603 MiB)
  0.000618 seconds (20.00 k allocations: 1.068 MiB)
  0.000663 seconds (20.00 k allocations: 1.068 MiB)

Note the change in execution time and allocations.

Sometimes, on a subsequent run, we again get 40k allocations

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.029740 seconds (122.36 k allocations: 5.917 MiB)
  0.001366 seconds (40.00 k allocations: 1.679 MiB)
  0.001490 seconds (40.00 k allocations: 1.679 MiB)

If I modify the code with e.g. some whitespace to force a recompile, I can trigger a change in allocations much more frequently.
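
One way to force that staleness from a script rather than by editing, assuming the file lives at ~/.julia/dev/elsa/src/elsa.jl (path is an assumption):

# Append a harmless trailing space so the precompile cache is considered stale;
# adjust the path to wherever elsa.jl actually lives.
path = expanduser("~/.julia/dev/elsa/src/elsa.jl")
open(path, "a") do io
    write(io, " ")
end
# The next `using elsa` in a fresh REPL should then recompile the cache.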

Fresh REPL after adding whitespace to force recompile

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
[ Info: Recompiling stale cache file /Users/pablo/.julia/compiled/v0.7/elsa/Jexn.ji for module elsa
  0.025894 seconds (107.69 k allocations: 5.596 MiB)
  0.000819 seconds (20.00 k allocations: 1.068 MiB)
  0.005586 seconds (20.00 k allocations: 1.068 MiB, 86.66% gc time)

Second run in a fresh REPL (no recompile)

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.062763 seconds (122.37 k allocations: 5.923 MiB, 53.05% gc time)
  0.001380 seconds (40.00 k allocations: 1.679 MiB)
  0.001364 seconds (40.00 k allocations: 1.679 MiB)

Third run

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.021969 seconds (107.69 k allocations: 5.596 MiB)
  0.000398 seconds (20.00 k allocations: 1.068 MiB)
  0.000440 seconds (20.00 k allocations: 1.068 MiB)

You get the idea.

Incidentally, if I run the module straight at the REPL, I never see the problem, and allocations are just 10k (half of the minimum above):

julia> module elsa
           function test()  
               for i in 1:10_000
                   foo!([0.0], 1, 1, 1, 1, rand(), 1, 1, (1, 1))   
               end   
           end 
           @noinline function foo!(vec, x1, x2, x3, x4, x5, x6, x7, (i, j))
               vec[i] = i 
               return nothing 
           end 
       end 
Main.elsa

julia> @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.023286 seconds (90.55 k allocations: 5.052 MiB)
  0.000304 seconds (10.00 k allocations: 937.656 KiB)
  0.000333 seconds (10.00 k allocations: 937.656 KiB)

X-ref: #25900 (I'm not sure it is related, though, as the using elsa itself is deterministic in this example)

julia> versioninfo()
Julia Version 0.7.0-beta.7
Commit 0978251ac5 (2018-06-25 01:11 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin17.6.0)
  CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
@KristofferC
Member

KristofferC commented Jun 25, 2018

Seems to be a type inference issue that has to do with precompilation.

10k allocations (no precompile):

  20 ─       invoke Elsa.foo!(%11::Array{Float64,1}, 1::Int64, 1::Int64, 1::Int64, 1::Int64, %42::Float64, 1::Int64, 1::Int64, (1, 1)::Tuple{Int64,Int64})

Every argument inferred

20k allocations (after first precompile):

  20 ─       invoke Elsa.foo!(%11::Any, 1::Any, 1::Int64, 1::Int64, 1::Int64, %42::Float64, 1::Int64, 1::Int64, (1, 1)::Tuple{Int64,Int64})

Some arguments not inferred

40k allocations (recompile):

  20 ─       invoke Elsa.foo!(%11::Array{Float64,1}, 1::Int64, 1::Int64, 1::Int64, 1::Any, %42::Float64, 1::Int64, 1::Any, (1, 1)::Any)

Some other arguments not inferred...
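
A sketch of how to inspect this yourself (the exact invocation is an assumption; @code_typed shows the optimized IR containing the invoke call):

julia> using elsa

julia> @code_typed elsa.test()   # check the argument types at the invoke Elsa.foo! call

For what it's worth, the allocation counts plausibly line up with one extra heap allocation per call for each Int or tuple argument that ends up boxed as Any, on top of the [0.0] array allocated in every one of the 10,000 iterations: array only gives 10k, array plus one box gives 20k, array plus three boxes gives 40k.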

Very nice MWE

Even something smaller, like

test() = foo!([0.0], 1)
@noinline foo!(vec, i) = vec[i] = i

can show this:

julia> @code_warntype Elsa.test()
Body::Int64
6 1 ─ %1 = invoke Base.vect(0.0::Float64)::Array{Float64,1}                                                                                                                                                     │
  │   %2 = invoke Elsa.foo!(%1::Array{Float64,1}, 1::Any)::Int64                                                                                                                                                │
  └──      return %2 

Note the 1::Any.
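
For completeness, a guess at the full file behind that reduced example (the two definitions need to live in a precompiled package for the failure to show up; the file name is assumed):

# Assumed contents of src/Elsa.jl; the inference failure only appears when
# this is loaded through precompilation, not when pasted at the REPL.
module Elsa

test() = foo!([0.0], 1)
@noinline foo!(vec, i) = vec[i] = i

end # module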

KristofferC added the compiler:precompilation and compiler:inference labels Jun 25, 2018
@pablosanjose
Contributor Author

Nice! Thanks for looking into it @KristofferC. So it's got to do with inference, but do you know why it fails randomly? It does seem to be related to the tuple destructuring.

@KristofferC
Member

KristofferC commented Jun 25, 2018

I don't think it has anything to do with the tuple, since I can get the ::Any without any tuples (see my last example).

Also, cannot reproduce on 0.6.

KristofferC added the regression label Jun 25, 2018
@pablosanjose
Contributor Author

Uhm, I do see how your ultra-minimalistic example fails to be inferred, but it doesn't seem to allocate randomly for me; it always behaves the same in that regard.

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.016478 seconds (62.47 k allocations: 3.216 MiB)
  0.000004 seconds (5 allocations: 256 bytes)
  0.000002 seconds (5 allocations: 256 bytes)

Do you see the inference failure with your MWE always, or randomly?

@KristofferC
Member

It's a bit random; sometimes the first argument fails to infer as well:

julia> @code_warntype Elsa.test()
Body::Int64
6 1 ─ %1 = invoke Base.vect(0.0::Float64)::Array{Float64,1}                                                                                                                                                     │
  │   %2 = invoke Elsa.foo!(%1::Any, 1::Any)::Int64                                                                                                                                                             │
  └──      return %2 

@StefanKarpinski
Member

cc @vtjnash

JeffBezanson added this to the 1.0.x milestone Jul 4, 2018
@vtjnash
Member

vtjnash commented Jul 4, 2018

May be fixed (at least the inference failure problems) by 4562ae2

vtjnash closed this as completed Jul 4, 2018
@pablosanjose
Contributor Author

Amazing, absolutely genius @vtjnash. Completely solved, AFAICT! Not only the MWE, but the real-world case too, it seems.

It may be too complicated to explain briefly, but did you understand the root cause (the non-deterministic behaviour in particular)?

@vtjnash
Member

vtjnash commented Jul 6, 2018

Yeah, I saw some uninitialized memory when doing some unrelated work, and traced it back to here.
