Skip to content

Conversation

LilithHafner
Copy link
Member

Let's see if we can keep runtime performance while getting rid of a major cause of compile time performance issues.

  • Set max methods to 1 (mostly disable world splitting)
  • Also disable word splitting when the matches are not fully covered (regardless of number of methods) [Using Claude]

@LilithHafner LilithHafner added the compiler:inference Type inference label Jul 24, 2025
@KristofferC
Copy link
Member

KristofferC commented Jul 24, 2025

See #48320 for an earlier attempt. Specifically #48320 (comment) for the breakage caused by imprecisely inferring the element type of empty arrays.

@topolarity
Copy link
Member

topolarity commented Jul 24, 2025

This data / test is great (always good to test and confirm expectations!)

FWIW, I thought about this a bit more after https://pretalx.com/juliacon-2025/talk/DCDEQV/ and I'm pretty convinced that we need this optimization.. Here's my argument:

As is, the compiler when "world-splitting" a wide (largely uncovered) function call has two cases:

  1. the "wide" call will not see new method definitions (it is effectively final)
  2. the "wide" call will see new method definitions (it is effectively an interface)

In the case of (1), we really want to optimize, since otherwise we are leaving free (and substantial) inference gains on the table (e.g. call returns concrete type vs. Any). In the case of (2), we do not want to optimize, because later after invalidations, we'll give up on optimization soon anyway.

My concern is that disabling world-splitting means basically always assuming that (2) is the situation, which implies that we are pretty much always leaving substantial inference gains on the table whenever the reality is (1)

@aviatesk
Copy link
Member

aviatesk commented Jul 25, 2025

I was thinking during your presentation that it might be better to do this deoptimization combined with the method table freezing mechanism implemented in #56143. That means optimizing with max_methods=3 only for method calls where the method table is frozen, and using max_methods=1 by default.

However, I think it's hard to freeze map or broadcasting anyway, so just doing that might not cover the breakage @KristofferC shared.

To find a good heuristic, we probably need to investigate in detail the inference conditions required to cover those packages as well. I don't think just tweaking max_methods will be enough.

@LilithHafner
Copy link
Member Author

@nanosoldier runtests()

Mostly to see runtime of pkgeval

@topolarity
Copy link
Member

topolarity commented Jul 25, 2025

Oof, 17 failures for --trim hello, world:

Trim verify finished with 17 errors, 0 warnings.

A lot of that would be improved by making sure we world-split for any anonymous function types (e.g. closures + inner functions for args expansion). Anonymity means that these are pretty much never extended so they should be treated as final

@LilithHafner
Copy link
Member Author

LilithHafner commented Jul 25, 2025

We need a good heuristic for

Will this call see no more new methods?


Candidates:

  1. Less than 4 matching methods (release, master)
  2. Exactly one matching method whose type signature fully covers the possible argument types (pr)
  3. Only if the argument types are concrete (even more conservative than pr, eliminates pretty much all non-piracy invalidations)
  4. If the function is anonymous
  5. If the function has been marked final
  6. If the callsite has been marked @worldsplit (or maybe we want to call it @static_dispatch)
  7. 2 | 4 | 5 | 6
  8. 3 | 4 | 5 | 6
  9. 2 | 4

@nsajko
Copy link
Member

nsajko commented Jul 25, 2025

Disabling world-splitting or even just changing the heuristic is quite heavy-handed. Meanwhile, safe approaches, like outlined in a comment of mine on PR #58788, still haven't been tried. I believe I actually got a branch locally that implements that but I haven't pushed it as I was waiting on that PR getting merged.

@nsajko
Copy link
Member

nsajko commented Jul 25, 2025

Furthermore, as I discuss there, setting max_methods to one is not appropriate for a wide class of functions, those which take types as arguments and need to have a separate method for the bottom type, because the bottom type subtypes each type so the method will appear in method queries even when it's not strictly relevant.

@LilithHafner
Copy link
Member Author

@nsajko, is f/g the "wide class of functions" you are referring to? Apparently there is a union splitting optimization for types that this PR fails to disable* which should make that a non-issue.

x@fedora:~/.julia/dev/julia$ ./julia --startup=no
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _' |  |
  | | |_| | | | (_| |  |  Version 1.13.0-DEV.895 (2025-07-23)
 _/ |\__'_|_|_|\__'_|  |  lh/max-methods-1/1450d944c3* (fork: 1 commits, 1 day)
|__/                   |

julia> f(::Type{Int}) = 1
f (generic function with 1 method)

julia> f(::Type{Union{}}) = 0.0
f (generic function with 2 methods)

julia> g(x) = f(x[1])
g (generic function with 1 method)

julia> Base.infer_return_type(g, Tuple{Vector{Union{Type{Union{}}, Type{Int}}}})
Union{Float64, Int64}

julia> f2(x::Int) = 1
f2 (generic function with 1 method)

julia> f2(x::Float64) = :a
f2 (generic function with 2 methods)

julia> g2(x) = f2(x[1])
g2 (generic function with 1 method)

julia> Base.infer_return_type(g, Tuple{Vector{Union{Int, Float64}}})
Any

*It'd be reassuring if someone could point out where that optimization takes place.

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

Report summary

❗ Packages that crashed

6 packages crashed only on the current version.

  • An internal error was encountered: 5 packages
  • GC corruption was detected: 1 packages

6 packages crashed on the previous version too.

✖ Packages that failed

159 packages failed only on the current version.

  • Package fails to precompile: 28 packages
  • Illegal method overwrites during precompilation: 1 packages
  • Package has test failures: 23 packages
  • Package tests unexpectedly errored: 98 packages
  • Package is using an unknown package: 1 packages
  • Networking-related issues were detected: 1 packages
  • Tests became inactive: 2 packages
  • Test duration exceeded the time limit: 5 packages

1195 packages failed on the previous version too.

✔ Packages that passed tests

31 packages passed tests only on the current version.

  • Other: 31 packages

5274 packages passed tests on the previous version too.

~ Packages that at least loaded

26 packages successfully loaded only on the current version.

  • Other: 26 packages

3176 packages successfully loaded on the previous version too.

➖ Packages that were skipped altogether

907 packages were skipped on the previous version too.

@LilithHafner
Copy link
Member Author

PkgEval is both very good and very bad. I estimate just over 100 reproducible failures on this PR that are not reproducible on master (all of which, I imagine, are due to packages that depend on type inference results being stable across Julia versions). I also estimate that PkgEval (a proxy for compile time + a shred of runtime, with bias from tests that fail fast) is 3x faster on this branch than master.

While preliminary results, 3x compile perf vs. breaking lots of folks (1% of our 10k registered packages) who depend on internals points to making some efforts to reduce breakage (esp. where we leak those internals from Base in returning empty arrays) and then proceeding.

@LilithHafner
Copy link
Member Author

Just looked through a nonrandom sample of 7 of the new failures and 6 of them were literally testing inference results or performance or display based derivatives of inference results and one had a MethodError from a Vector{Any} (also dependent on inference results, but at least a semantic failure). This final category is the highest priority to fix prior to merging.

@KristofferC
Copy link
Member

Just looked through a nonrandom sample of 7 of the new failures and 6 of them were literally testing inference results or performance

Just to state the obvious but consider the scenario where inference is completely turned off. That would:

  • only break packages that test inference or rely on inference somehow.
  • likely make PkgEval a lot faster (tests are dominated by compilation time).

But that doesn't make it a good idea.

@LilithHafner
Copy link
Member Author

Yes, obviously. This PR doesn't hurt runtime of concretely typed code, though.

@timholy
Copy link
Member

timholy commented Jul 28, 2025

Might want to benchmark JuliaInterpreter. It's already too slow, but if this makes it really too slow that would not be a good outcome.

@LilithHafner
Copy link
Member Author

Here's the JuliaInterpreter.jl benchmarks

julia> judge(pr, nightly).benchmarkgroup
7-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "recursive other 1_000" => TrialJudgement(+32.02% => regression)
  "recursive self 1_000" => TrialJudgement(+33.79% => regression)
  "throw long 1_000" => TrialJudgement(-34.07% => improvement)
  "long function 5_000" => TrialJudgement(+38.60% => regression)
  "tight loop 10_000" => TrialJudgement(+23.62% => regression)
  "ccall library" => TrialJudgement(+4.85% => invariant)
  "ccall ptr" => TrialJudgement(+21.08% => regression)

@timholy
Copy link
Member

timholy commented Jul 29, 2025

It's definitely a hit, but not as bad as I feared.

@PallHaraldsson
Copy link
Contributor

PallHaraldsson commented Aug 18, 2025

Furthermore, as I discuss there, setting max_methods to one is not appropriate for a wide class of functions

Is:

Set max methods to 2

a middle ground? I'm trying to understand effect of 1 vs 3 (vs 2; or 4 eliminated from PR #58788). This seems like an optimization and should not break code?! Only is breaking some bad tests/packages using internals? Anything that "fixes (almost) all invalidation" seems like a great thing (even for 1.12 if possible..., since a small change, that could easily be undone).

@oscardssmith
Copy link
Member

is Set max methods to 2 a middle ground?

Not really. There's somewhat of a qualitative difference between 1+fully covers and 2. The difference is that the first makes it so there is no invalidation without piracy, while the 2nd doesn't.

nsajko added a commit to nsajko/julia that referenced this pull request Aug 24, 2025
A function which is meant to have methods added to it by users should
have a small `max_methods` value, as world-splitting just leads to
unnecessary invalidation in that case, in the context of the package
ecosystem, where users are allowed to add arbitrarily many methods to
such functions.

xref PR JuliaLang#57884

xref PR JuliaLang#58788

xref PR JuliaLang#58829

xref PR JuliaLang#59091
nsajko added a commit to nsajko/julia that referenced this pull request Aug 27, 2025
A function which is meant to have methods added to it by users should
have a small `max_methods` value, as world-splitting just leads to
unnecessary invalidation in that case, in the context of the package
ecosystem, where users are allowed to add arbitrarily many methods to
such functions.

xref PR JuliaLang#57884

xref PR JuliaLang#58788

xref PR JuliaLang#58829

xref PR JuliaLang#59091
@adienes
Copy link
Member

adienes commented Aug 27, 2025

I'm phrasing this in vague terms since I don't know precisely what it would entail, but would it make sense to provide a separate "max_methods=1 build" that package maintainers can test with? in the same way that a package maintainer might run their CI tests on nightly, or debug builds, or with threads, or at -O1 etc.

I'm envisioning that giving a longer onramp to observe & handle any potential breakages, if the fear of disruption is too much to just put this into nightly

@nsajko nsajko added the DO NOT MERGE Do not merge this PR! label Aug 27, 2025
nsajko added a commit to nsajko/julia that referenced this pull request Aug 28, 2025
* Introducing new types and methods for a callable can invalidate
  already compiled method instances of a function for which
  world-splitting is enabled (`max_methods`).

* Invalidation of sysimage or package precompiled code worsens latency
  due to requiring recompilation.

* Lowering the `max_methods` setting for a function often causes
  inference issues for existing code that is not completely
  type-stable (which is a lot of code). In many cases this is easy to
  fix by avoiding method proliferation, such as by merging some methods
  and introducing branching into the merged method.

This PR aims to fix the latter issue for some `Tuple`-related methods
of some functions where decreasing `max_methods` might be interesting.

Seeing as branching was deliberately avoided in the bodies of many of
these methods, I opted for the approach of introducing local functions
which preserve the dispatch logic as before, without branching. Thus
there should be no regressions, except perhaps because of changed
inlining costs.

This PR is a prerequisite for PRs which try to decrease `max_methods`,
such as PRs:

* JuliaLang#59091

* JuliaLang#59377
nsajko added a commit to nsajko/julia that referenced this pull request Aug 29, 2025
A function which is meant to have methods added to it by users should
have a small `max_methods` value, as world-splitting just leads to
unnecessary invalidation in that case, in the context of the package
ecosystem, where users are allowed to add arbitrarily many methods to
such functions.

xref PR JuliaLang#57884

xref PR JuliaLang#58788

xref PR JuliaLang#58829

xref PR JuliaLang#59091
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:inference Type inference DO NOT MERGE Do not merge this PR! invalidations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants