
Optimize production binaries #9224

Closed

T-256 opened this issue Dec 21, 2023 · 4 comments

Labels: release (Related to the release process)

Comments

@T-256 (Contributor) commented Dec 21, 2023

#9031 improved compile time, but it also affects runtime performance. This issue was opened to propose a separate build profile for production.

Note that the current release profile is used for benchmarking (comparing Ruff to itself), while the new production profile is supposed to have better runtime performance and be used when comparing Ruff to other tools. We can document when each profile should be used.

By separating these profiles, perhaps we could also tune the current release profile for even faster compile times.
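For concreteness, here is a rough sketch of what such a split could look like in Cargo.toml (the exact settings are illustrative, not Ruff's actual configuration):

```toml
# Hypothetical sketch only, not Ruff's actual configuration.
# The existing release profile stays tuned for reasonable compile times,
# while a separate "production" profile inherits from it and re-enables
# the more expensive optimizations for shipped binaries.
[profile.release]
lto = "thin"        # cheaper cross-crate optimization, faster builds

[profile.production]
inherits = "release"
lto = "fat"         # full cross-crate LTO: slower builds, faster binary
codegen-units = 1   # let LLVM see more code at once for better inlining
```

A custom profile like this would be selected with `cargo build --profile production`.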

@BurntSushi (Member) commented
Thanks for filing this issue! I remember you mentioning this in #9031.

I would like to push back a bit here.

IMO, the point of release isn't just to be fast to compile. It's to balance runtime performance and compile time. We don't want to chase the road of faster compiles too far. We already have modes that are meant to compile as quickly as possible (the dev and test profiles). As I said in that PR, the issue with having two different configurations is that it becomes very easy to accidentally benchmark and profile the wrong thing, and if not that, you just wind up back with high iteration times. This is important because LTO greatly affects function inlining, and inlining is really the mother of all optimizations: it opens the door to even more optimizations. It is typically the case, although not always, that many of the benefits of LTO can be unlocked with well-placed #[inline] or #[inline(always)] hints. (This was also discussed in the PR.)
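To illustrate that last point with a hand-wavy example (not code from the Ruff codebase): an explicit hint makes a small, hot function a candidate for inlining across crate boundaries even when fat LTO is off:

```rust
// Illustrative only, not from the Ruff codebase. Without LTO, rustc can
// normally only inline a function within its defining crate; an explicit
// #[inline] hint also makes it a candidate for callers in other crates.
#[inline]
pub fn is_identifier_continuation(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}
```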

Profiling the wrong thing can be maddening when trying to fix a regression in the benchmark results. For example, if our codspeed benchmarks are run with the production profile, but one ends up trying to reproduce that locally with the release profile, you might wind up chasing the wrong thing. We could have a production profile and not use it in codspeed, but then you aren't tracking the performance of the actual final artifact shipped to end users; instead, it's something different. And if we do use production in codspeed, then you'll wind up needing to use the production profile to chase down regressions, which puts us back at square one: very long iteration times. With that said, not every performance problem will be obscured if one uses two different profiles; it's likely only some will be. So even if I benchmark using the release profile while we ship the production profile, it isn't guaranteed to end in sadness. (I say this to make sure I'm not over-stating the downsides here.)

My personal feeling here is to not add a separate production profile. Or at least, let's see how far we can get without adding one. I don't feel like it would be the worst problem in the world to have a separate profile personally, so I don't mean to say that I'll always be opposed to it. But I would like to give the current configuration a try.

@T-256 (Contributor, Author) commented Dec 21, 2023

> We could have a production profile and not use it in codspeed, but then you aren't tracking the performance of the actual final artifact shipped to end users.

@BurntSushi Thanks for your response; if I've misunderstood something, please tell me.
We use codspeed to check for speed regressions or improvements in PRs, so comparing them on the same release profile doesn't affect the speed diffs.
Here is how I'd like the build profiles' use cases to look:

release profile:

  • CI for PRs and commits
  • Benchmarks for PRs and commits

production profile:

  • End users
  • Benchmarks when publishing releases (e.g. the difference between 0.1.9 and 0.1.8)
  • Reference numbers for runtime measurements in the docs

Devs can easily use production when putting measured benchmark numbers into the docs. Also, codspeed shows speed (%) diffs between GitHub releases.

> So even if I benchmark using the release profile while we ship the production profile, it isn't guaranteed to end in sadness.

That said, end users would barely notice anything; there would be a little confusion for contributors compiling locally, which could be solved by a Building section in the docs.

> Or at least, let's see how far we can get without adding one.

How would you measure the performance difference between the current release profile and the optimized one (production)? Perhaps a ready-to-test profile could help, like a profiling build profile.

If we decide not to use production anywhere ourselves, I'd still recommend making it available as an option for volunteers who want it (that needs some help on the docs side).

@zanieb (Member) commented Dec 21, 2023

Benchmarking is not just about CI; as developers, we need to be able to run benchmarks locally while optimizing code. If we're not using the same profile as we will eventually release to users, we could make the wrong optimizations. A goal in #9031 was to improve the developer experience and reduce the amount of time it takes to work on performance improvements. We believe this will yield more significant improvements in the long run.

As @BurntSushi said: if we have concerns after using this profile for a while, we can consider adding a more optimized profile. In the meantime, I think the performance regression from #9031 is not significant enough to warrant the additional complexity.

@charliermarsh added the release label on Jan 2, 2024
@MichaReiser (Member) commented
I'm closing this issue, considering that we haven't received any user feedback that Ruff got noticeably slower after releasing #9031.

We can revisit more involved optimisations of our release binary in the future (and also explore e.g. profile-guided optimisation).
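For reference, profile-guided optimisation with stock rustc flags could look roughly like the sketch below; the paths and the workload are made up for illustration, and this is not a documented Ruff workflow:

```sh
# Hypothetical PGO sketch, not a documented Ruff workflow.
# 1. Build an instrumented binary that records execution profiles.
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release
# 2. Run it on a representative workload to collect profile data.
./target/release/ruff check some_large_project/
# 3. Merge the raw profiles (llvm-profdata ships with the LLVM tools).
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data
# 4. Rebuild using the collected profile to guide optimization.
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
```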
