doc why julia sometimes can't save inference results #40
It might be worth documenting, but I think no one besides Jameson and Jeff understands the full story here. My own understanding extends to JuliaLang/julia#32705, and that's pretty much it. In the case of the failure on Gadfly, what does
hah, indeed, running
That is a long time. For those methods with long inference times, it may be worth digging deeper. What are the inference times of the methods it calls? Can you break it up in a way that allows more caching? Does adding precompile statements to packages it uses help? For example, if
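As a concrete sketch of this workflow (assuming the SnoopCompile 1.x API used elsewhere in this thread; the Gadfly workload here is only illustrative):

```julia
using SnoopCompile

# Collect (inference time, MethodInstance) pairs for every call that takes
# at least 10 ms to infer; the list comes back sorted by inference time.
inf_timing = @snoopi tmin=0.01 begin
    using Gadfly                      # illustrative workload
    plot(x=rand(10), y=rand(10))
end

# Inspect the worst offenders, then group directives by owning package
# and write one precompile_*.jl file per package.
show(inf_timing)
pc = SnoopCompile.parcel(inf_timing)
SnoopCompile.write("/tmp/precompile", pc)
```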
With the feature-freeze happening in a few days, you could check and find out. My assessment is it's not likely to be as rapid as any of us hope; it's a multifaceted problem.
time to first plot is exactly the same with 1.4. i suppose i should be happy it isn't slower. i don't even know where to begin to "break it up in a way that allows more caching" as i don't understand at all how things work under the hood. at least there is a very simple and quick test:
the function for which inference takes the longest is
It doesn't. As soon as it knows which method you're calling (which it figures out within ~microseconds), it looks up the code for the method and then starts inferring the type of every variable. If it successfully figures out the types of all the arguments to the first function call, it then recurses into that function and figures everything out about it, including the return type. This is fully recursive: every single method that it can determine in advance will be fully inferred, all the way to the bottom of the call chain. Only if inference fails does this terminate early. And all that means is that when it executes the method, it will run up until the point of failure, then use run-time dispatch to figure out which methods get called, and that will then trigger inference on all their dependent functions. So eventually you'll have to infer everything anyway. So that 10s is "only" for the call to

I haven't had time to poke at Gadfly, but I should have mentioned one other strategy: adding
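The recursive behavior described above can be sketched with a toy call chain (all names here are invented for illustration). A single `precompile` directive at the top of the chain is enough to drive inference through every reachable callee:

```julia
# Toy call chain; all names are hypothetical.
inner(x) = x + 1
middle(x) = inner(x) * 2
outer(x) = middle(x) - 3

# One directive at the entry point makes inference recurse through `middle`
# and `inner` as well, since their argument types are fully determinable.
precompile(outer, (Int,))

@assert outer(5) == 9   # (5 + 1) * 2 - 3 = 9
```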
I have a related problem. I have a package which does not include any precompile directives yet. When I run `inf_timing = @snoopi tmin=0.01 include("snoopfile.jl")` for some snoop file, several methods are skipped. The bad news is that these methods do not appear in any precompile directive when I parcel the

I have run the logger (as indicated here), and the reason they are skipped is "skipping due to eval failure". I can see that this message is printed at `SnoopCompile.jl/src/parcel_snoopi.jl`, line 151 (at commit 40c840e),
but I am unable to figure out how to work around this. Any idea on how to fix the "eval failure" so that these methods can eventually be precompiled? Thanks!
The most common explanation is also described in that section; most likely the "ownership" of the signature

Of course, it's also possible that SnoopCompile's algorithm for determining ownership needs improvement. You can test this for yourself by loading the requisite packages and calling
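For instance (a hedged sketch; `snoopfile.jl` stands in for whatever workload script you use), you can run the snoop by hand and inspect how `parcel` assigned ownership:

```julia
using SnoopCompile

# "snoopfile.jl" is a hypothetical workload script.
inf_timing = @snoopi tmin=0.01 include("snoopfile.jl")
pc = SnoopCompile.parcel(inf_timing)

# `pc` maps each owning package to its generated precompile directives;
# a signature assigned to the wrong package (or dropped) shows up here.
for (pkg, directives) in pc
    println(pkg, ": ", length(directives), " directives")
end
```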
The methods that are skipped are defined in sub-modules of my package (and not exported from the main module of the package). Can this be the reason why
That seems likely to be a bug. Can you post a simple reproducer? That is a situation that SnoopCompile should be able to handle, with proper scoping.
Yes of course! |
I have the MWE; consider this folder tree:

And these files:

```julia
# MyPackage.jl
module MyPackage

module ModuleA
export Foo
struct Foo end
end # module

module ModuleB
using MyPackage.ModuleA
export hello
hello(::Foo) = "hello"
end # module

end # module
```

```julia
# snoopfile.jl
using MyPackage.ModuleA
using MyPackage.ModuleB
foo = Foo()
hello(foo)
```

```julia
# snoop.jl
using SnoopCompile
using Base.CoreLogging
logger = SimpleLogger(stderr, CoreLogging.Debug);
inf_timing = @snoopi include("MyPackage/snoopfile.jl")
pc = with_logger(logger) do
    SnoopCompile.parcel(inf_timing)
end
SnoopCompile.write("precompile", pc)
```

Now, develop the package and run:

```
julia> include("snoop.jl")
┌ Debug: Module MyPackage: skipping Tuple{Type{Foo}} due to eval failure
└ @ SnoopCompile /home/fverdugo/.julia/packages/SnoopCompile/qyn6g/src/parcel_snoopi.jl:151
┌ Debug: Module MyPackage: skipping Tuple{typeof(hello),Foo} due to eval failure
└ @ SnoopCompile /home/fverdugo/.julia/packages/SnoopCompile/qyn6g/src/parcel_snoopi.jl:151
```

Moreover, no precompile statement was generated. Something to do with sub-modules, and names defined in different sub-modules? More info:

```
julia> versioninfo()
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIAROOT = /home/fverdugo/Apps/julia/julia-1.3.1
```

```
(snoopi_mwe) pkg> st
Status `~/Code/jl/test/snoopi_mwe/Project.toml`
  [6c4e32b7] MyPackage v0.0.0 [`MyPackage`]
  [aa65fe97] SnoopCompile v1.1.0
```
In my example, the ownership can be reduced to a single package, but not to a single module, since there are two sub-modules involved. Is this the problem?
Is it the intended behavior in
Very interesting. If you change the snoop file to

```julia
import MyPackage
foo = MyPackage.ModuleA.Foo()
MyPackage.ModuleB.hello(foo)
```

then everything works:

```
shell> cat precompile/precompile_MyPackage.jl
function _precompile_()
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    precompile(Tuple{Type{MyPackage.ModuleA.Foo}})
    precompile(Tuple{typeof(MyPackage.ModuleB.hello),MyPackage.ModuleA.Foo})
end
```

This is basically a consequence of the following difference in printing (ref JuliaLang/julia#23806):

```
tim@diva:/tmp/sc$ julia -q

julia> using MyPackage

julia> MyPackage.ModuleA.Foo()
MyPackage.ModuleA.Foo()

julia> exit()
tim@diva:/tmp/sc$ julia -q

julia> using MyPackage.ModuleA

julia> Foo()
Foo()
```

Fortunately, this proves to be easily fixable with
Great! With your trick, I get the precompiles for my real package.
Glad it helps. One request: next time, please open a fresh issue, as your issue was quite different from the OP. #59 will close this, but I am not certain that the OP's request has been fully satisfied. (But I'm not sure what to do about it, which is why I'm allowing it to close this issue.)
Alright! Sorry for the inconvenience...
gadfly has submodules too, and #59 helps a bit. thanks. sadly, i still don't see much of an improvement in time-to-first-plot with additional precompile directives.
(EDITED) Are you also adding the precompiles to Gadfly's dependencies? I am not sure what happens if the inference result for the caller is kept but discarded for the callee; it seems possible to me that this would confuse inference and it would just decide to re-infer everything from scratch. So you might have to make sure you work your way up from the bottom.

One thing I notice about Gadfly's design is that it has a lot of types: I count 87 instances of
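One way to read "work your way up from the bottom": each dependency carries its own `_precompile_()` file, in the shape `SnoopCompile.write` emits (as shown earlier in this thread), run at the end of the module so its inference results are cached before downstream packages are built. A minimal hypothetical sketch, with invented names:

```julia
module LowLevelDep   # hypothetical bottom-of-the-stack dependency

struct Knob
    value::Float64
end

turn(k::Knob) = k.value * 2

# The form SnoopCompile.write emits: guard against running outside of
# package precompilation, then issue the directives.
function _precompile_()
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    precompile(Tuple{Type{Knob},Float64})
    precompile(Tuple{typeof(turn),Knob})
end
_precompile_()   # run at the bottom of the module

end # module

@assert LowLevelDep.turn(LowLevelDep.Knob(3.0)) == 6.0
```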
indeed, Gadfly has a lot of structs, and uses them to fully embrace multiple dispatch. i've boiled down one use of a struct to a minimal example which does not precompile:
when i
this is the output both times:
it is all in one module, no reference to anything in another module is made, there are no submodules, how could there possibly be any ambiguity about the types of the input arguments or return value?
That might be an important observation, but also might not. I've noticed that coverage misses such lines sometimes too. What if you have a more complicated function that calls

Also beware the interpreter, which can run code without inferring it first; it's safest to put your "snoop script" in a file (
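The interpreter caveat matters because bare top-level statements may never be compiled at all, so they leave nothing for snooping to record; code inside a function must be compiled, and hence inferred. A toy illustration with invented names:

```julia
struct Pt
    x::Float64
    y::Float64
end

# At top level, `Pt(1.0, 2.0).x + Pt(3.0, 4.0).y` may be handled by the
# interpreter and never inferred; wrapping it in a function forces
# compilation, so its inference shows up in the snooping results.
workload() = Pt(1.0, 2.0).x + Pt(3.0, 4.0).y

@assert workload() == 5.0
```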
despite Plots.jl's recent 2x reduction in time-to-first-plot using SnoopCompile, for Gadfly i can only achieve a 25% speedup. i presume this is because "there are some significant constraints that sometimes prevent Julia from saving even the inference results". it would be nice to know ways Gadfly's source code could be refactored instead. is it worth itemizing in the SnoopCompile docs the specific cases for which julia can't save the inference results? or are things too much in flux now with the ongoing efforts to improve the compiler?