-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inference: remove throw
block deoptimization completely
#49260
Conversation
throw
block deoptimization completelythrow
block deoptimization completely
@nanosoldier |
ffba552
to
e7dee0e
Compare
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
I would like to remove it, but this seems surprisingly strongly negative on our benchmarks. Can we do something to reduce that, or is that just some sort of artifact of being a small test case? |
Yeah, there seems to be profitability for small cases. I will add more examples to BaseBenchmarks.jl (by using existing packages) and see what it will say. |
bc32263
to
ee45c04
Compare
I think we should merge this. Now that tab complete relies on proving termination and effectfree-ness, this PR also makes tab complete better. |
e7dee0e
to
a24372e
Compare
ee45c04
to
81287f2
Compare
a24372e
to
07165d7
Compare
@nanosoldier |
Your job failed. |
07165d7
to
3ae3529
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
c8a5046
to
d77836e
Compare
3ae3529
to
7328bb3
Compare
7328bb3
to
b08925c
Compare
@nanosoldier |
b08925c
to
808bae1
Compare
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
808bae1
to
906bf12
Compare
Although benchmarks suggest that this is indeed an effective optimization, this optimization has been quite incomplete from the beginning and more importantly started to cause numerous issues in other projects. So I will prioritize the development efficiency of those projects and remove this optimization nevertheless. We could try to come up with an alternative solution to improve latency for those code paths, but it doesn't seem to be a good idea to leave this situation as is. |
After experimenting with #49235, I started to question if we are getting any actual benefit from the `throw` block deoptimization anymore. This commit removes the deoptimization from the system entirely. Based on the numbers below, it appears that the deoptimization is not very profitable in our current Julia-level compilation pipeline, with the effects analysis playing a significant role in reducing latency. Here are the updated benchmark: | Metric | master | #49235 | this commit | |-------------------------|-----------|-------------|--------------------------------------------| | Base (seconds) | 15.579300 | 15.206645 | 15.42059 | | Stdlibs (seconds) | 17.919013 | 17.667094 | 17.404586 | | Total (seconds) | 33.499279 | 32.874737 | 32.826162 | | Precompilation (seconds) | 53.488528 | 53.152028 | 53.152028 | | First time `plot(rand(10,3))` [^1] | `3.432983 seconds (16.55 M allocations)` | `3.477767 seconds (16.45 M allocations)` | `3.539117 seconds (16.43 M allocations)` | | First time `solve(prob, QNDF())(5.0)` [^2] | `4.628278 seconds (15.74 M allocations)` | `4.609222 seconds (15.32 M allocations)` | `4.547323 seconds (15.19 M allocations: 823.510 MiB)` | [^1]: With disabling precompilation of Plots.jl. [^2]: With disabling precompilation of OrdinaryDiffEq.
2853f37
to
af93df0
Compare
CI is green after #55356 - Good to merge? |
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
Looks like a few inference regressions, but nothing too surprising. IMO this is good to merge. |
Due to JuliaLang/julia#49260, the constructor is now being inlined, so we need to check for `Expr(:new)` and not just `:call`.
…49260) Co-authored-by: Cody Tapscott <topolarity@tapscott.me> Co-authored-by: Oscar Smith <oscardssmith@gmail.com>
This causes a weird inference regression, see #55583. |
After investigating the cause of the invalidation reported in #55583, it was found that the issue arises only when `r` is propagated as an extended lattice element, such as `PartialStruct(UnitRange{Int}, ...)`, for the method of `getindex(::String, r::UnitRange{Int})`. Specifically, the path at https://github.com/JuliaLang/julia/blob/cebfd7bc66153b82c56715cb1cb52dac7df8eac8/base/compiler/typeinfer.jl#L809-L815 is hit, so the direct cause was the recursion limit for constant inference. To explain in more detail, within the slow path of `nextind` which is called inside `getindex(::String, ::UnitRange{Int})`, the 1-argument `@assert` is used https://github.com/JuliaLang/julia/blob/cebfd7bc66153b82c56715cb1cb52dac7df8eac8/base/strings/string.jl#L211. The code related to `print` associated with this `@assert` further uses `getindex(::String, ::UnitRange{Int})`, causing the recursion limit. This recursion limit is only for constant inference, which is why we saw this regression only for the `PartialStruct` case. Moreover, since this recursion limit occurs within the `@assert`-related code, this issue did not arise until now (i.e. until #49260 was merged). As a solution, I considered improving the recursion limit itself, but decided that keeping the current code for the recursion limit of constant inference is safer. Ideally, this should be addressed on the compiler side, but there is certainly deep recursion in this case. As an easier solution, this commit resolves the issue by changing the 1-arg `@assert` to the 2-arg version. - replaces #55583
After investigating the cause of the invalidation reported in #55583, it was found that the issue arises only when `r` is propagated as an extended lattice element, such as `PartialStruct(UnitRange{Int}, ...)`, for the method of `getindex(::String, r::UnitRange{Int})`. Specifically, the path at https://github.com/JuliaLang/julia/blob/cebfd7bc66153b82c56715cb1cb52dac7df8eac8/base/compiler/typeinfer.jl#L809-L815 is hit, so the direct cause was the recursion limit for constant inference. To explain in more detail, within the slow path of `nextind` which is called inside `getindex(::String, ::UnitRange{Int})`, the 1-argument `@assert` is used https://github.com/JuliaLang/julia/blob/cebfd7bc66153b82c56715cb1cb52dac7df8eac8/base/strings/string.jl#L211. The code related to `print` associated with this `@assert` further uses `getindex(::String, ::UnitRange{Int})`, causing the recursion limit. This recursion limit is only for constant inference, which is why we saw this regression only for the `PartialStruct` case. Moreover, since this recursion limit occurs within the `@assert`-related code, this issue did not arise until now (i.e. until #49260 was merged). As a solution, I considered improving the recursion limit itself, but decided that keeping the current code for the recursion limit of constant inference is safer. Ideally, this should be addressed on the compiler side, but there is certainly deep recursion in this case. As an easier solution, this commit resolves the issue by changing the 1-arg `@assert` to the 2-arg version. - replaces #55583
* Generalize symbol type for debug scopes * More scope adjustment * Adjust to JuliaLang/julia#49260 * More Core.Compiler adjustments
After experimenting with #49235, I started to question if we are getting any actual benefit from the
throw
block deoptimization anymore.This commit removes the deoptimization from the system entirely.
Based on the numbers below, it appears that the deoptimization is not very profitable in our current Julia-level compilation pipeline, with the effects analysis playing a significant role in reducing latency.
Here are the updated benchmark:
plot(rand(10,3))
13.432983 seconds (16.55 M allocations)
3.477767 seconds (16.45 M allocations)
3.539117 seconds (16.43 M allocations)
solve(prob, QNDF())(5.0)
24.628278 seconds (15.74 M allocations)
4.609222 seconds (15.32 M allocations)
4.547323 seconds (15.19 M allocations: 823.510 MiB)
Footnotes
With disabling precompilation of Plots.jl. ↩
With disabling precompilation of OrdinaryDiffEq. ↩