Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unix: BOLT fixes #463

Merged
merged 1 commit into from
Jan 1, 2025
Merged

unix: BOLT fixes #463

merged 1 commit into from
Jan 1, 2025

Conversation

indygreg
Copy link
Collaborator

@indygreg indygreg commented Jan 1, 2025

As part of investigating failures with BOLT when upgrading to LLVM 19, I found and fixed a few issues with BOLT.

First, test_embed had been segfaulting on BOLT instrumented binaries. Why I'm not entirely sure. But the segfault only seems to occur in instrumentation mode. These tests are doing low-level things with the interpreter. So I suspect some kind of global mutable state issue or something.

I found the exact tests triggering the segfaults and added annotations to skip them.

The CPython build system treats the segfault as fatal on 3.13 but not 3.12. This means that on 3.12 we were only running a subset of tests and not collecting BOLT instrumentation nor applying optimizations for all tests after test_embed.

The removal of the segfault enables us to enable BOLT on 3.13+.

Second, LLVM 19.x has a hard error when handling PIC compiled functions containing computed gotos. It appears prior versions of LLVM could silently have buggy behavior in this scenario. We need to skip functions with computed gotos to allow LLVM 19.x to work with BOLT. It makes sense to apply this patch before LLVM 19.x upgrade to prevent bugs with computed gotos.

Third, I noticed BOLT was complaining about the lack of -update-debug-sections during instrumentation.

The 2nd and 3rd issues require common arguments to both BOLT instrumentation and application invocations. The patch fixing both introduces a new configure variable to hold common BOLT arguments. This patch is a good candidate for upstreaming.

@indygreg
Copy link
Collaborator Author

indygreg commented Jan 1, 2025

Bleh. More test_embed failures on 3.14 that need resolved...

@zanieb zanieb self-requested a review January 1, 2025 21:26
@zanieb zanieb added platform:darwin Specific to the macOS platform platform:linux Specific to the Linux platform labels Jan 1, 2025
As part of investigating failures with BOLT when upgrading to LLVM
19, I found and fixed a few issues with BOLT.

First, `test_embed` had been segfaulting on BOLT instrumented binaries.
Why I'm not entirely sure. But the segfault only seems to occur in
instrumentation mode. These tests are doing low-level things with the
interpreter. So I suspect some kind of global mutable state issue
or something.

I found the exact tests triggering the segfaults and added annotations
to skip them.

The CPython build system treats the segfault as fatal on 3.13 but
not 3.12. This means that on 3.12 we were only running a subset of tests
and not collecting BOLT instrumentation nor applying optimizations for
all tests after `test_embed`.

The removal of the segfault enables us to enable BOLT on 3.13+.

Second, LLVM 19.x has a hard error when handling PIC compiled functions
containing computed gotos. It appears prior versions of LLVM could
silently have buggy behavior in this scenario. We need to skip functions
with computed gotos to allow LLVM 19.x to work with BOLT. It makes sense
to apply this patch before LLVM 19.x upgrade to prevent bugs with
computed gotos.

Third, I noticed BOLT was complaining about the lack of
`-update-debug-sections` during instrumentation.

The 2nd and 3rd issues require common arguments to both BOLT
instrumentation and application invocations. The patch fixing both
introduces a new configure variable to hold common BOLT arguments.
This patch is a good candidate for upstreaming.
Comment on lines -391 to -394
# Due to a SEGFAULT when running `test_embed` with BOLT instrumented binaries, we can't use
# BOLT on Python 3.13+.
# TODO: Find a fix for this or consider skipping these tests specifically
echo "BOLT is disabled on Python 3.13+"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! Thank you.

# On 3.12 (minimum BOLT version), the segfault causes the test harness to
# abort and BOLT optimization uses the partial test results. On 3.13, the segfault
# is a fatal error.
if [ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_10}" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these also be gated by -n "${BOLT_CAPABLE}" so we don't disable tests unnecessarily?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually run the full suite of tests. So it doesn't matter today.

(It would be nice to actually run the full test suite against the built distribution but that's for another PR I suppose.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't these be reflected in the distributed Python too? Like, a consumer would have these tests disabled?

Agree it doesn't seem critical.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, I forgot we distributed the unit tests.

I'll file an issue to clean things up. Agree we could do something better here. And we may be paving over a legit bug somewhere by disabling tests that segfault.

@indygreg indygreg merged commit 4859cdf into main Jan 1, 2025
297 checks passed
@indygreg indygreg deleted the gps/bolt-fixes branch January 1, 2025 23:08
zanieb pushed a commit that referenced this pull request Jan 2, 2025
This is a redo of #420, which was merged prematurely.

With the BOLT changes from #463 merged, LLVM 19 _just works_.

As part of this we also modernize the BOLT apply settings to follow the
recommendations at
https://llvm.org/devmtg/2024-03/slides/practical-use-of-bolt.pdf.

This includes enabling support for loading hot code from a huge page at
runtime. This should _just work_ and could result in perf wins via
improved iTLB hit rate, etc.
@zanieb
Copy link
Member

zanieb commented Jan 4, 2025

Sort of tragically, I cannot reproduce the test_embed failures in CPython alone so I'm going to have to start using components here one by one until it occurs? Or attach a debugger to this build? I'm not sure. It's looking hard.

@zanieb
Copy link
Member

zanieb commented Jan 4, 2025

Also, I'm not seeing anything related to

Second, LLVM 19.x has a hard error when handling PIC compiled functions containing computed gotos. It appears prior versions of LLVM could silently have buggy behavior in this scenario. We need to skip functions with computed gotos to allow LLVM 19.x to work with BOLT.

Am I missing some configuration to encounter this behavior?

edit: I reproduced this with a configure invocation copied from CI... just need to minimize it now.

@zanieb
Copy link
Member

zanieb commented Jan 4, 2025

Ah okay --enable-shared is needed to reproduce both of these failures — and second fix is required before I can get nice test_embed segfaults :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform:darwin Specific to the macOS platform platform:linux Specific to the Linux platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants