compiler performance #14743
Comments
I've been thinking about this. We could try hashing the IR code, but we'd have to do some work to avoid spurious differences due to naming and the like; e.g. we could name all functions after a hash of their IR. Of course this'll also seriously complicate backtraces/debug info.
Would this be mitigated if we started by only considering different specializations of the exact same method? I imagine we could do a reasonably quick experiment to see if this might be profitable.
Yes for backtraces, no for debug info, but I think it might be fixable.
The approach I had contemplated was replacing actual debug info with some sort of template values and then, when you get a cache hit, using the previously generated code but with the debug info "template" filled in. Not sure how well that could be made to work though.
I think the biggest problem is to know which of the specializations you're in while walking the stack. You could potentially do it by looking at the local variables of the parent frame and then trying to figure out which one would have had to have been called.
What I was describing would result in different specialized versions (with different debug info), but would reuse the generated code, so it would save time but not memory. Of course, that's not as good as using the same generated code, but that seems much harder.
Ah, I understand
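To make the idea in the comments above concrete, here is a minimal sketch of a compile cache keyed on a hash of normalized IR, with debug info split out as a "template" so that a cache hit reuses the generated code while each specialization keeps its own debug info. Everything in it is an assumption made for illustration: the toy textual IR (`define @name(...)`, `%local`, `!dbg(file:line)`), the helper names `normalize_ir` and `split_debug_info`, and the `CodeCache` class are hypothetical and do not correspond to anything in Julia or LLVM.

```python
import hashlib
import re


def split_debug_info(ir_text):
    """Replace every '!dbg(...)' annotation with a placeholder.

    Returns the debug-free template and the list of removed values,
    in order of appearance (toy syntax, assumed for this sketch).
    """
    values = []

    def stash(match):
        values.append(match.group(1))
        return "!dbg(?)"

    template = re.sub(r"!dbg\(([^)]*)\)", stash, ir_text)
    return template, values


def normalize_ir(ir_text):
    """Erase naming differences that do not affect the generated code."""
    # Rename the function being defined (and recursive self-calls) to @self.
    header = re.match(r"define (@[\w.]+)", ir_text)
    if header:
        own_name = re.escape(header.group(1)) + r"(?![\w.])"
        ir_text = re.sub(own_name, "@self", ir_text)

    # Rename locals to %0, %1, ... in order of first appearance.
    mapping = {}

    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = "%" + str(len(mapping))
        return mapping[name]

    return re.sub(r"%[\w.]+", rename, ir_text)


class CodeCache:
    """Cache compiled code by a hash of the normalized, debug-free IR."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # stand-in for the expensive codegen step
        self._compiled = {}
        self.misses = 0

    def get(self, ir_text):
        template, debug_values = split_debug_info(ir_text)
        key = hashlib.sha256(normalize_ir(template).encode()).hexdigest()
        if key not in self._compiled:
            self.misses += 1
            self._compiled[key] = self._compile(template)
        # Shared code object plus this specialization's own debug info.
        return self._compiled[key], debug_values
```

As in the thread, this saves compile time but not necessarily memory, and it leaves open the harder question of identifying which specialization a frame belongs to when walking the stack.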
Wasn't gambit a bit buggy when we first tried it in the very early days? I guess it should be easy to try it out and run PkgEvaluator.
An flisp to llvm bytecode compiler could also be a great JSOC project. We need to announce JSOC soon too.
I think that compiler performance is a little too important to hang our hopes on a JSoC project.
Of course we wouldn't hang our hopes on it, but there is no harm in mentioning it as a potential candidate project - in case we don't get around to doing it.
Is there any update on which solution will be given to improve flisp performance?
Check out https://github.com/JuliaLang/julia/pulls?q=is%3Apr+author%3AJeffBezanson+is%3Aclosed for some of Jeff's PRs which have already implemented some of the solutions.
On my laptop, It might be related to #16434 and, therefore, we should probably also look into the effects of splitting up the function. It might be much faster to compile six smaller versions.
With a quick look, a significant amount of compile time for
Still to go here: #16837
There's also an especially bad case in #17137 we should fix.
"Try using Gambit-C again", "Scheme" wasn't obvious (well I guess implied by flisp..): https://en.wikipedia.org/wiki/Gambit_(scheme_implementation) Is FemtoLisp on the way out? If/when this works? I see recent issues on a REPL for it..
@PallHaraldsson This is just adding noise by asking such questions here. Best to do it on julia-users.
Doesn't seem to be anything left on this list worth doing / tracking with a meta issue.
This is a tracking issue for work on speeding up the compiler itself. Between LLVM 3.7 and the upcoming jb/functions we have significant slowdowns. Dealing with this is becoming quite urgent. All phases of the system could use improvement.
Front end
- Clean up lowering passes (julia-syntax.scm). Probably at least 2-3 of them can be combined or removed. (simplify and speed up front end #14997)
IR
- AST representation needs to be more compact and include better debug info (improved IR #15609, Improve inlined line numbers #14949, WIP: overhaul file name info #15583)
- More efficient Slot representation (separate Slot into SlotNumber and TypedSlot to save space #15951)
Type inference
- Use workqueue instead of recursion (type-inference workq #15300)
- More efficient lookup structure for cached inferred trees (TupleMap type #15779)
- inline_worthy after inference and cache it (#15970)
- Method-cache-style widening before invoking recursive inference, to cut down workload
- Const lattice element (add a lattice element type for constants in inference #15785)
- combine tfunc and specializations arrays (merge specializations and tfunc #15918)
- allow type inf to always allocate new LambdaInfos to avoid copies in both specializations and method cache
Other
- There is sometimes a regression due to precompile+ #15934; believed to be largely fixed
Codegen
- Codegen time might be slightly super-linear in total amount of code (see set JULIA_TEST_MAXRSS_MB=600 on appveyor #14845 (comment)); some codegen tests & fixes #15632
- Quadratic jit debug info registration (jit code debug registration is O(n^2) #14846): gdb bug, not Julia
- Add -O0 option
- Use less memory (Approaches for avoiding fragmentation in code memory allocation #14626)
- calling convention for constant functions that fully avoids codegen (RFC: specialized calling convention for pure functions that return a constant #16837)
Some specific issues:
- function AST in a module limited to 2^16 constants #14113: AST representation with many constants
- egal bottleneck
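The "Use workqueue instead of recursion" item under Type inference can be illustrated with a generic worklist fixed-point loop. This is only a sketch under stated assumptions, not Julia's inference algorithm: the lattice is reduced to plain sets of type names, and `infer_all`, `local_types`, and `callees` are hypothetical names chosen for the example. The point it shows is that a queue handles mutually recursive functions and very deep call chains without growing the native stack.

```python
from collections import deque


def infer_all(local_types, callees):
    """Compute, for each function, the set of types it may return.

    local_types: dict mapping function name -> set of type names it
                 produces directly.
    callees:     dict mapping function name -> list of functions whose
                 results it may return (every callee must be a key of
                 local_types).
    Returns a dict mapping function name -> set of type names (the least
    fixed point over this toy set lattice).
    """
    result = {fn: set(types) for fn, types in local_types.items()}

    # Reverse edges: when a function's result grows, its callers must be
    # revisited.
    callers = {fn: [] for fn in local_types}
    for fn, targets in callees.items():
        for target in targets:
            callers[target].append(fn)

    queue = deque(local_types)
    queued = set(local_types)
    while queue:
        fn = queue.popleft()
        queued.discard(fn)

        # Join the function's own types with its callees' current results.
        new = set(local_types[fn])
        for target in callees.get(fn, ()):
            new |= result[target]

        if new != result[fn]:
            result[fn] = new
            for caller in callers[fn]:
                if caller not in queued:
                    queue.append(caller)
                    queued.add(caller)
    return result
```

Because each result set only grows and the set of type names is finite, the loop terminates even when the call graph contains cycles.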