Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TTFP regression since 1.7 #43157

Closed
JeffBezanson opened this issue Nov 19, 2021 · 19 comments
Closed

TTFP regression since 1.7 #43157

JeffBezanson opened this issue Nov 19, 2021 · 19 comments
Labels
compiler:latency Compiler latency regression Regression in behavior compared to a previous version
Milestone

Comments

@JeffBezanson
Copy link
Member

1.7:

julia> @time using Plots
  4.262361 seconds (8.31 M allocations: 578.459 MiB, 8.05% gc time, 15.63% compilation time)

julia> @time display(plot(rand(10)))
  8.242615 seconds (17.50 M allocations: 934.041 MiB, 2.45% gc time, 99.32% compilation time)

master:

julia> @time using Plots
  4.669242 seconds (9.23 M allocations: 614.779 MiB, 7.41% gc time, 15.85% compilation time)

julia> @time display(plot(rand(10)))
  9.217127 seconds (19.85 M allocations: 1.052 GiB, 2.43% gc time, 96.14% compilation time)
@JeffBezanson JeffBezanson added regression Regression in behavior compared to a previous version compiler:latency Compiler latency labels Nov 19, 2021
@aviatesk
Copy link
Member

I guess this is a dup of #41914 (comment). In short @time using xyz is expected to report very different results between v1.7 and master, since now @time and its family force compilation of the applied block while they didn't before.

@martinholters
Copy link
Member

I also see a regression with time julia -e 'using Plots; display(plot(rand(10)))', somewhere between 5% and 10%, which we cannot blame on @time.

@KristofferC
Copy link
Member

I guess this is a dup of #41914 (comment).

Well, it turned out that that was an actual regression (#42379).
Also, if Julia compiled more upfront (before the timing starts) then the timing should be lower?

@oscardssmith oscardssmith added this to the 1.8 milestone Jan 24, 2022
@oscardssmith
Copy link
Member

oscardssmith commented Jan 24, 2022

As of today's master, I'm seeing a regression from 9.6 seconds to 12.7 seconds. We should try very hard to address this before releasing 1.8.

@vchuravy
Copy link
Member

Can you run a bisect?

@oscardssmith
Copy link
Member

What's the syntax? (I haven't run one before)

@vchuravy
Copy link
Member

https://www.metaltoad.com/blog/beginners-guide-git-bisect-process-elimination

You start with:

git bisect start
git bisect bad HEAD
git bisect good v1.7.0

@oscardssmith
Copy link
Member

I haven't managed to bisect it well. the time seems somewhat stochastic, and there appear to be a variety of regressions and improvements mixed together which make it very difficult to hone in on the correct spot.

@timholy
Copy link
Member

timholy commented Jan 24, 2022

May not be relevant, but the number of unique invalidations are:

  • 1.7: 1285
  • 1.8: 2622

precompile_blockers doesn't seem to suggest these count for a lot of re-inference time, but it's possible that compilation takes longer.

@oscardssmith
Copy link
Member

Is that expected? That seems really surprising to me.

@timholy
Copy link
Member

timholy commented Jan 24, 2022

No, though it will take some digging to figure out why. From a brief glance, it looks like Base.require(::PkgId) is on the invalidation list, which is really bad. (That won't affect plotting time, just loading time.)

@timholy
Copy link
Member

timholy commented Jan 25, 2022

xref #43939

@timholy
Copy link
Member

timholy commented Jan 26, 2022

I looked at the residuals after #43939. This line accounts for much of the difference between 1.7 and nightly: if I comment out the <= v"1.7.9", then I get

julia 1.7:

julia> @time using Plots
  4.177945 seconds (9.68 M allocations: 661.563 MiB, 7.33% gc time, 29.76% compilation time)

julia> @time display(plot(rand(10)))
  7.644650 seconds (18.80 M allocations: 1008.568 MiB, 2.63% gc time, 95.29% compilation time)

nightly, with #43939 added:

julia> @time using Plots
  4.456418 seconds (9.87 M allocations: 652.349 MiB, 8.53% gc time, 23.08% compilation time)

julia> @time display(plot(rand(10)))
  6.403581 seconds (16.56 M allocations: 911.363 MiB, 2.43% gc time, 94.51% compilation time)

so I think we can safely say this is fixed once that merges.

This may be an opportune moment to make two points:

  • if we're actually serious about latency reduction (are we?), relying entirely on packages is dicey because those packages might do things that confound comparisons (as was the case here). @aviatesk's program to start real inference benchmarks seems like an important start, though we should probably expand our coverage since latency is more than just inference performance. While improving inference & codegen performance may be an esoteric art, adding benchmarks to a suite is not. I will start looking for slowpokes among SnoopCompile's reports (e.g., https://timholy.github.io/SnoopCompile.jl/stable/reference/#SnoopCompile.flatten and/or https://github.com/timholy/SnoopCompile.jl/blob/master/src/parcel_snoopl.jl for the LLVM time), perhaps others can do the same.
  • despite Plots' extensive precompile directives, 33% of the TTFP is still inference, and theoretically this should be eliminated once the full PR series currently at Add relocatable root compression #43881 (PR 2 of an anticipated 3) gets merged. The savings for Makie should theoretically be about 50% of the TTFP. I don't have the final PR submitted or even working (lots of segfaults), but I'm optimistic that it's close, and getting some better-informed eyeballs on this work might help (particularly for the last one not yet submitted, but a glance at the current one would be appreciated if possible).

@giordano
Copy link
Contributor

Why is that VERSION <= v"1.7.9"?? 🔮

@timholy
Copy link
Member

timholy commented Jan 26, 2022

Yeah, that's a little strange. Presumably they meant something like < v"1.8.0-DEV".

@KristofferC
Copy link
Member

Why is that VERSION <= v"1.7.9"?? crystal_ball

Do you think there will be a v1.7.10? ;)

@timholy
Copy link
Member

timholy commented Jan 26, 2022

When you have @KristofferC on your team, the impossible becomes possible.

@giordano
Copy link
Contributor

VERSION <= v"1.6.9" is more at risk of actually becoming false at some point 😛

@timholy
Copy link
Member

timholy commented Jan 26, 2022

Closed by #43939

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:latency Compiler latency regression Regression in behavior compared to a previous version
Projects
None yet
Development

No branches or pull requests

8 participants