[WIP] Various build system improvements #101093

indygreg · 2023-01-16T23:33:41Z

I'm creating this PR to track various build system changes that I'd like to make to CPython.

I'm creating a draft PR instead of opening real PRs because general OSS experience has taught me that if I trickle things out slowly without any prior context project maintainers won't see the big picture and/or benefits of the changes. My hope is that by tracking all proposed changes in a Git branch that people can easily diff that I'll have an easier time convincing folks to accept the patches.

Next Steps

CPython maintainer(s) should look at each commit in this branch. I write detailed commit messages. See each commit for more details.

Commits beginning with gh-xxxx: aren't yet assigned issues/PRs.

Then, just tell me which commits to turn into issues/PRs and I'll do so. Keep in mind some commits are dependent on others. For example, the BOLT enhancements (which appear to deliver meaningful pyperformance wins) depends on the overhaul build rules for optimized binaries commit and its precursors.

I have several more potential patches to include in this branch. Including a bunch that enable cross-compiling for Apple platforms. I figured I'd start with just the build system + BOLT improvements to get people's attention.

Most of this work is derived from my efforts on https://github.com/indygreg/python-build-standalone.

cc @corona10 since you seem to have an interest in all the BOLT PRs.

corona10 · 2023-01-17T01:30:07Z

@indygreg cc @gvanrossum

Due to the holiday weekend in Korea (Lunar new year), I will review relevant PRs by next weekend.
Thanks for submitting these PRs :)

aaupov · 2023-01-18T00:43:54Z

Sidenote: BOLT now supports DWARF5, so there's no need to force -gdwarf-4

https://github.com/python/cpython/blob/main/configure.ac#L1950-1952

corona10 · 2023-01-18T01:51:59Z

Sidenote: BOLT now supports DWARF5, so there's no need to force -gdwarf-4

Nice news! Thanks for let us know :)

@indygreg, Let's drop the flag -gdwarf-4 it was only workaround approach
#95908 (comment)

cc @aaupov

indygreg · 2023-01-21T19:15:12Z

No rush on the reviews, @corona10: I'm also busy celebrating the Lunar New Year this weekend.

Some specific guidance I could use help with is how to split up / fold commits and how to associate them with a GitHub Issue. To be honest I find the whole CPython contribution process of managing GitHub Issues in addition to a PR to be confusing. Is there a 1:1 relationship? 1:n? https://devguide.python.org/ is a bit ambiguous on this topic.

I'm lazy/busy, so prefer as little overhead / separate GitHub Issues as possible. But I can go along with whatever. If someone tells me what to do I'll do it.

Also, work that I'd like to do but haven't started working on yet is enabling some of makesetup's features to be rolled into configure.ac. e.g. support for building/linking an extension statically. A lot of work in past ~1.5 years towards moving extension module config logic into configure.ac enables this. But I'm not sure if makesetup deprecation was an end goal. In my mind it is certainly possible to inline makesetup fully into configure.ac.

The end state I'd like to achieve is the ability for CPython's build system to build ~all extensions statically with minimal effort. e.g. ./configure --extensions-policy=all-static. I'd have to dig it up, but I believe there was a thread somewhere with some CPython maintainers in support of supporting easier static linking of extensions. I just don't know if anyone has opinions on how it should play out. makesetup still has lines that annotate back to Guido's commits made in 1994 and experience has taught me that it is best to ask before taking a sledge hammer to 25+ year old functionality!

corona10 · 2023-01-24T10:28:12Z

@indygreg Let's separate the PR with BOLT itself and configuration enhancement.
Please reference #101282 about the BOLT PRs.

corona10 · 2023-01-24T13:09:07Z

@indygreg

The end state I'd like to achieve is the ability for CPython's build system to build ~all extensions statically with minimal effort.

One more thing, I have also interested in supporting build all extensions statically with minimal effort for internal purposes.
(Yeah, I have use cases if this build is supported)
To support this, we need to write the PEP first. Can we discuss this through email?
My email is donghee.na@python.org

erlend-aasland · 2023-01-24T13:27:40Z

The end state I'd like to achieve is the ability for CPython's build system to build ~all extensions statically with minimal effort.

What wrong with just editing Modules/Setup.stdlib in order to achieve this?

erlend-aasland · 2023-05-08T07:54:38Z

@indygreg, are you planning on following up this PR?

indygreg · 2023-05-13T18:01:45Z

@indygreg, are you planning on following up this PR?

I could probably find time to polish off the PGO/BOLT changes if still wanted. Let's treat the broader extension module pieces I referred to as out of scope.

BTW, one of the things the stack of commits in this PR does is it adds BOLT optimization to libpython. Current main applies BOLT only to the python binary. It is unclear to me if the various CPython performance with BOLT threads I've read are based on the incomplete application of BOLT only to python. When I was developing these commits in January I did see some significant pyperformance changes after BOLT was applied to libpython. That makes sense: there's more code in libpython than in python.

erlend-aasland · 2023-05-14T20:54:33Z

It would be really helpful if you could split this up in separate issues for each isolated change, and for each issue write a few sentences about why the change is needed, what value it adds, etc. Applying BOLT to libpython makes a lot of sense to me (based on your analysis), but I have no experience when it comes to BOLT :)

indygreg · 2023-05-14T22:01:39Z

It would be really helpful if you could split this up in separate issues for each isolated change, and for each issue write a few sentences about why the change is needed, what value it adds, etc.

Yup. The PR summary called out that I was awaiting instructions/feedback on whether/how to proceed. I wasn’t sure if people were even interested in the refactors. All the details/justification are there in the commit messages. I just need to put in the time to turn the commits into issues/PRs. Maybe I’ll start later today…

erlend-aasland · 2023-05-14T22:04:41Z

All the details/justification are there in the commit messages.

That's great, but it is easier to interact using GitHub Issues as opposed to comments on single commits.

indygreg · 2023-05-15T02:11:49Z

First 2 PRs from this branch are up:

Need to wait on publishing the next one because it depends on #104491 and I reckon GitHub's already brittle interface for stacking PRs will fall apart with all the automation CPython is using. So I'm not even going to try to do it.

Various rules were only ever invoked once and had minimal bodies. I don't see a benefit to the indirection. So this commit inlines rules to simplify the PGO logic. skip news

This commit overhauls the make-based build system's rules for building optimized binaries. Along the way it fixes a myriad of bugs and shortcomings with the prior approach. The old way of producing optimized binaries had various limitations: * `make [all]` would do work when PGO was enabled because the phony `profile-opt` rule was non-empty. This prevented no-op PGO builds from working at all. This meant workflows like `make; make install` either incurred extra work or failed due to race conditions. * Same thing for BOLT, as its `bolt-opt` rule was also non-empty and always ran during `make [all]`. * BOLT could not be run multiple times without a full rebuild because `llvm-bolt` can't instrument binaries that have already received BOLT optimizations. * It was difficult to run BOLT on its own because of how various make targets and their dependencies were structured. * I found the old way that configure and make communicated the default targets to be confusing and hard to understand. There are essentially 2 major changes going on in this commit: 1. A rework of the high-level make targets for performing a build and how they are defined. 2. A rework of all the make logic related to profile-based optimization (read: PGO and BOLT). Build Target Rework ====================== Before, we essentially had `build_all`, `profile-opt`, `bolt-opt` and `build_wasm` as our 3 targets for performing a build. `all` would alias to one of these, as appropriate. And there was another definition for which _simple_ make target to evaluate for non-optimized builds. This was likely `build_all` or `all`. In the rework, we introduce 2 new high-level targets: * `build-plain` - Perform a build without optimizations. * `build-optimized` - Perform a build with optimizations. `build-plain` is aliased to `build_all` in all configurations except WASM, where it is `build_wasm`. `build-optimized` by default is aliased to a target that prints an error message when optimizations aren't enabled. If PGO or BOLT are enabled, it is aliased to their respective target. `build-optimized` is the logical successor to `profile-opt`. I felt it best to delete `profile-opt` completely, as the new `build-*` high-level targets feel more friendly to use. But if people lament its loss, we can add a `profile-opt: build-optimized` to achieve almost the same result. Profiled-Based Optimization Rework ================================== Most of the make logic related to profile-based optimization (read: PGO and BOLT) has been touched in this change. A major issue with the old way of doing things was we used phony, always-executed make rules. This is a bad practice in make because it undermines no-op builds. Another issue is that the separation between the rules and what order they ran in wasn't always clear. Both PGO and BOLT consist of the same 4 phase solution: instrument, run, analyze, and apply. However, these steps weren't clearly expressed in the make logic. This is especially true for BOLT, which only had 1 make rule. Another issue with BOLT is that it was really easy to get things into a bad state. e.g. if you applied BOLT to `pythonX.Y` you could not run BOLT again unless you rebuilt `pythonX.Y` from source. In the new world, we have separate `profile-<tool>-<stage>-stamp` rules defining the 4 distinct `instrument`, `run`, `analyze`, and `apply` stages for both PGO and BOLT. Each of these stages is tracked by a _stamp_ semaphore file so progress can be captured. This should all be pretty straightforward. There is some minimal complexity here to handle BOLT's optional dependency on PGO, as BOLT either depends on `build_all` or `profile-pgo-apply-stamp`. As part of the refactor to BOLT we also preserve the original input binary before BOLT is applied. This original file is restored if BOLT runs again. This greatly simplifies repeated BOLT invocations, as make doesn't perform needless work. However, this is all best effort, as it is possible for some make target evaluations to still get things in a bad state. Other Remarks ============= If this change perturbs any bugs, they are likely around cleaning behavior. The cleaning rules are a bit complicated and not clearly documented. And I'm unsure which targets CPython developers often iterate on. It is highly possible that state cleanup of PGO and/or BOLT files isn't as robust as it needs to be. I explicitly deleted some calls to PGO cleanup because those calls prevented no-op `make [all]` from working. It is certainly possible something somewhere (release automation?) relied on these files being deleted when they no longer are. We still have targets to purge profile files and it should be trivial to add these to appropriate make rules.

This allows easily customizing the arguments to `llvm-bolt` without having to edit the Makefile. Arguments can be passed to configure and are reflected in configure output, which can be useful for log analysis. When defined in the Makefile we use `?=` syntax so the flags can be overridden via `make VAR=VALUE` syntax or via environment variables. Super useful for iterating on different BOLT flags.

This ensures `LD_LIBRARY_PATH` is set. Without this, I was able to tickle a libpythonX.Y.so not found error.

Before, we only supported running BOLT on the main `python` binary. If a shared library was in play, it wouldn't be optimized. That was leaving a ton of optimization opportunities on the floor. This commit adds support for running BOLT on libpython. Functionality is disabled by default because BOLT asserts on LLVM 15, which is the latest LLVM. I've built LLVM tip and it is able to process libpython just fine. So it is known to work.

The generated sysconfig data during builds encodes a PEP-425 platform tag. During native (target=host) builds, the bootstrapped compiler runs Python code in sysconfig to derive an appropriate value. For cross compiles, we fall back to logic in configure (that code lives around the changed lines) to derive an appropriate platform tag, which is exported as an environment variable during builds. And there is a "backdoor" in `sysconfig.py` that causes `sysconfig.get_platform()` to return that value. The logic in configure for deriving an appropriate platform tag is a far cry from what's in `sysconfig.py`. Ideally that logic would be fully (re)implemented in configure. But that's a non-trivial amount of work. Recognizing that configure makes inadequate platform tag decisions during cross-compiles, this commit switches `_PYTHON_HOST_PLATFORM` from a regular output variable to a "precious variable" (in autoconf speak). This has the side-effect of allowing invokers to define the variable, effectively allowing them to explicitly set the platform tag during builds to a correct value when configure otherwise wouldn't set one.

(This change is a quick and dirty way to merge some of the build system improvements I'm proposing in pythongh-101093 before the 3.12 feature freeze. I wanted to scope bloat myself to fix some longstanding deficiencies in the build system around profile-guided builds. But I'm getting soft resistance to the reviews so close to the freeze deadline and it is obvious that we need a simpler solution to hit the 3.12 deadline. While this change is quick and dirty, it attempts to not make things worse.) Before this change, we only applied bolt to the main python binary. After this change, we apply bolt to libpython if it is configured. In shared library builds, most of the C code is in libpython so it is critical to apply bolt to libpython to realize bolt benefits. This change also reworks how bolt instrumentation is applied. It effectively removes the readelf based logic added in pythongh-101525 and replaces it with a mechanism that saves a copy of the pre-bolt binary and restores that copy when necessary. This allows us to perform bolt optimizations without having to manually delete the output binary to force a new bolt run. We also add a new make target for purging bolt files and hook it up to `clean` so bolt state is purged when appropriate. `.gitignore` rules have been added to ignore files related to bolt. Before and after this refactor, `make` will no-op after a previous run. Both versions should also share common make DAG deficiencies where targets fail to trigger as often as they need to or can trigger prematurely in certain scenarios. e.g. after this change you may need to `rm profile-bolt-stamp` to force a bolt run because there aren't appropriate non-phony targets for bolt's make target to depend on. Fixing this is a non-trivial amount of work that will likely have to wait until the 3.13 window. To make it easier to iterate on custom BOLT settings, the flags to pass to instrumentation and application are now defined in configure and can be overridden by passing `BOLT_INSTRUMENT_FLAGS` and `BOLT_APPLY_FLAGS`.

bedevere-bot added the awaiting review label Jan 16, 2023

corona10 self-requested a review January 17, 2023 00:16

corona10 self-assigned this Jan 17, 2023

indygreg force-pushed the build-system-fixes branch from cc3f6f5 to 2e2ba00 Compare January 17, 2023 03:57

indygreg force-pushed the build-system-fixes branch from 2e2ba00 to fe8e296 Compare January 21, 2023 18:53

erlend-aasland self-requested a review January 21, 2023 20:15

This was referenced Jan 24, 2023

Drop -gdwarf-4 flag from the BOLT build. #101278

Closed

Enhance the BOLT build process #101282

Closed

erlend-aasland added the pending The issue will be closed if no feedback is provided label May 8, 2023

indygreg force-pushed the build-system-fixes branch 2 times, most recently from ff74a68 to c63a03d Compare May 15, 2023 02:03

This was referenced May 15, 2023

gh-101282: move BOLT config after PGO #104493

Merged

Overhaul build system rules related to profiling #104523

Open

indygreg added 4 commits May 15, 2023 18:46

pythongh-104523: inline minimal PGO rules

8eb0b50

Various rules were only ever invoked once and had minimal bodies. I don't see a benefit to the indirection. So this commit inlines rules to simplify the PGO logic. skip news

gh-xxxx: run BOLT instrumented binary with $(RUNSHARED)

02418fd

This ensures `LD_LIBRARY_PATH` is set. Without this, I was able to tickle a libpythonX.Y.so not found error.

indygreg added 2 commits May 15, 2023 18:46

indygreg force-pushed the build-system-fixes branch from c63a03d to 9e50ee0 Compare May 16, 2023 01:47

indygreg mentioned this pull request May 20, 2023

gh-101282: Apply BOLT optimisations to libpython for shared builds #104709

Merged

erlend-aasland removed the pending The issue will be closed if no feedback is provided label Jun 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[WIP] Various build system improvements #101093

[WIP] Various build system improvements #101093

Uh oh!

indygreg commented Jan 16, 2023

Uh oh!

corona10 commented Jan 17, 2023

Uh oh!

aaupov commented Jan 18, 2023

Uh oh!

corona10 commented Jan 18, 2023

Uh oh!

indygreg commented Jan 21, 2023

Uh oh!

corona10 commented Jan 24, 2023 •

edited

Loading

Uh oh!

corona10 commented Jan 24, 2023

Uh oh!

erlend-aasland commented Jan 24, 2023

Uh oh!

erlend-aasland commented May 8, 2023

Uh oh!

indygreg commented May 13, 2023

Uh oh!

erlend-aasland commented May 14, 2023

Uh oh!

indygreg commented May 14, 2023 via email

Uh oh!

erlend-aasland commented May 14, 2023

Uh oh!

indygreg commented May 15, 2023

Uh oh!

Uh oh!

Uh oh!

[WIP] Various build system improvements #101093

Are you sure you want to change the base?

[WIP] Various build system improvements #101093

Uh oh!

Conversation

indygreg commented Jan 16, 2023

Next Steps

Uh oh!

corona10 commented Jan 17, 2023

Uh oh!

aaupov commented Jan 18, 2023

Uh oh!

corona10 commented Jan 18, 2023

Uh oh!

indygreg commented Jan 21, 2023

Uh oh!

corona10 commented Jan 24, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

corona10 commented Jan 24, 2023

Uh oh!

erlend-aasland commented Jan 24, 2023

Uh oh!

erlend-aasland commented May 8, 2023

Uh oh!

indygreg commented May 13, 2023

Uh oh!

erlend-aasland commented May 14, 2023

Uh oh!

indygreg commented May 14, 2023 via email

Uh oh!

erlend-aasland commented May 14, 2023

Uh oh!

indygreg commented May 15, 2023

Uh oh!

Uh oh!

corona10 commented Jan 24, 2023 •

edited

Loading