gh-101282: Apply BOLT optimisations to libpython for shared builds #104709

Merged: 4 commits, May 22, 2023

@indygreg (Contributor) commented May 20, 2023

(This change is a quick and dirty way to merge some of the build system improvements I'm proposing in gh-101093 before the 3.12 feature freeze. I wanted to scope bloat myself to fix some longstanding deficiencies in the build system around profile-guided builds. But I'm getting soft resistance to the reviews so close to the freeze deadline and it is obvious that we need a simpler solution to hit the 3.12 deadline. While this change is quick and dirty, it attempts to not make things worse.)

Before this change, we only applied bolt to the main python binary. After this change, we apply bolt to libpython if it is configured. In shared library builds, most of the C code is in libpython so it is critical to apply bolt to libpython to realize bolt benefits.

This change also reworks how bolt instrumentation is applied. It effectively removes the readelf based logic added in gh-101525 and replaces it with a mechanism that saves a copy of the pre-bolt binary and restores that copy when necessary. This allows us to perform bolt optimizations without having to manually delete the output binary to force a new bolt run.
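
The save-and-restore idea described above can be sketched roughly as follows. This is a minimal illustration, not the Makefile logic from this PR: the `.prebolt` suffix, the function names, and the `fake_rewrite` stand-in for the BOLT step are all assumptions made for the example.

```shell
# Illustrative sketch only: names and the ".prebolt" suffix are assumptions,
# not taken from the PR's Makefile.

# Put the pristine binary back if an earlier run left a saved copy behind.
restore_prebolt() {
    bin="$1"
    if [ -e "$bin.prebolt" ]; then
        mv "$bin.prebolt" "$bin"
    fi
}

# Stand-in for the real rewriting step: writes a modified copy of $1 to $2.
fake_rewrite() {
    { cat "$1"; printf '+rewritten'; } > "$2"
}

# Rewrite a binary in place while keeping the original next to it.
rewrite_keeping_original() {
    bin="$1"
    restore_prebolt "$bin"            # always start from the pristine input
    fake_rewrite "$bin" "$bin.new"
    mv "$bin" "$bin.prebolt"          # save the pre-rewrite copy
    mv "$bin.new" "$bin"              # install the rewritten output
}
```

Because every run first restores the saved copy, calling `rewrite_keeping_original` twice on the same path does not rewrite an already rewritten file, which is the property that removes the need to delete the output by hand.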

We also add a new make target for purging bolt files and hook it up to clean so bolt state is purged when appropriate.
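
As a rough sketch of what such a purge step might do, the function below removes BOLT-related leftovers from a build directory. The file patterns are assumptions for illustration (only `profile-bolt-stamp` is named in this description), and `purge_bolt_files` is a hypothetical helper, not the make target added by the PR.

```shell
# Illustrative sketch: the file patterns below are assumptions for the
# example, apart from profile-bolt-stamp, which the description mentions.
purge_bolt_files() {
    dir="$1"
    # rm -f stays silent when a pattern matches nothing.
    rm -f "$dir"/*.prebolt "$dir"/*.fdata "$dir"/profile-bolt-stamp
}
```

Hooking a step like this into `clean` means a fresh build never picks up stale profiles or saved binaries from an earlier run.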

.gitignore rules have been added to ignore files related to bolt.

Before and after this refactor, make will no-op after a previous run. Both versions likely share the same make DAG deficiencies: targets can fail to trigger as often as they need to, or can trigger prematurely in certain scenarios. For example, after this change you may need to rm profile-bolt-stamp to force a bolt run, because there are no appropriate non-phony targets for bolt's make target to depend on. Fixing this is a non-trivial amount of work that will likely have to wait until the 3.13 window.
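
The stamp-file behaviour can be illustrated with a small guard function. This is a sketch under stated assumptions: `run_unless_stamped` is a hypothetical helper and does not reproduce the Makefile rules, but it shows why deleting the stamp is what forces a new run.

```shell
# Illustrative sketch of a stamp-file guard; function and file names are
# assumptions for the example.
run_unless_stamped() {
    stamp="$1"
    shift
    if [ -e "$stamp" ]; then
        return 0                  # stamp present: treat the step as done
    fi
    "$@" && touch "$stamp"        # record success so later runs no-op
}
```

The guard only looks at whether the stamp exists, not at whether the inputs changed, so a stale stamp hides work that should be redone until the file is removed.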

To make it easier to iterate on custom BOLT settings, the flags to pass to instrumentation and application are now defined in configure and can be overridden by passing BOLT_INSTRUMENT_FLAGS and BOLT_APPLY_FLAGS.
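
The "default unless overridden" pattern can be sketched as below. The default values are placeholders rather than real BOLT flags, and `pick_flags` is a hypothetical helper; the actual defaults are whatever configure defines.

```shell
# Illustrative sketch: the default values are placeholders, not real BOLT
# flags, and pick_flags is a hypothetical helper.
pick_flags() {
    override="$1"
    default="$2"
    # ${var:-fallback} expands to the fallback when var is unset or empty.
    printf '%s\n' "${override:-$default}"
}

# A value supplied by the caller wins over the built-in default.
instrument_flags=$(pick_flags "${BOLT_INSTRUMENT_FLAGS:-}" "<default-instrument-flags>")
apply_flags=$(pick_flags "${BOLT_APPLY_FLAGS:-}" "<default-apply-flags>")
```

With the real build, the override would be given on the configure command line, along the lines of `./configure --enable-bolt BOLT_APPLY_FLAGS="<your flags>"`, where the flag string is whatever you want to experiment with.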

@corona10 (Member) left a comment

Please update the docs for `BOLT_INSTRUMENT_FLAGS` and `BOLT_APPLY_FLAGS`: https://docs.python.org/3.12/using/configure.html#performance-options

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@corona10 (Member) left a comment

Overall LGTM.

@indygreg (Contributor, Author) commented

> Please update docs for BOLT_INSTRUMENT_FLAGS and BOLT_APPLY_FLAGS: https://docs.python.org/3.12/using/configure.html#performance-options

Done in latest push.

@erlend-aasland (Contributor) left a comment

I cleaned up the docs and AC code; hope you don't mind.

I left some questions. Regarding BOLT technical stuff, I lean on Dong-hee's review.

@erlend-aasland (Contributor) commented May 21, 2023

> Done in latest push.

We'd appreciate it if you didn't force-push:

  • It does not play very nice with the GitHub UX (messes up CI runs, commit history, review comments, etc.)
  • We often collaborate on PRs; pulling in new changes using `git merge --no-ff` is more friendly to our workflow

(This is also mentioned in the devguide.)

@erlend-aasland changed the title from "gh-101282: rework the BOLT build process" to "gh-101282: Apply BOLT optimisations to libpython for shared builds" on May 22, 2023
@erlend-aasland merged commit 5360cb3 into python:main on May 22, 2023
@erlend-aasland (Contributor) commented

Thanks, Greg and Dong-hee!
