1.55.0 segfault compiling stage1 rustc_codegen_llvm on powerpc64le #89744
Comments
That stack trace looks like it's coming from proc-macro expansion (cstr), though it doesn't look like that dependency changed. It doesn't seem like anything in particular changed with the proc macro code, either... It would be good to get a better stack trace, at least one with symbols... |
Here is a stack trace of the failing command. It's a bit spammy; I have the session still open if you want to explore e.g. the local variables or something, let me know what I should run.
|
Hm, yeah, that definitely seems like something is going wrong with the procedural macro interface, but I'm not sure what would be responsible. It looks like rust/library/proc_macro/src/lib.rs Line 259 in 68dfa07
It seems like the segfault suggests that
|
It seems unlikely (AFAICT, the closure code here doesn't use any unstable details) but perhaps #81360 is somehow at fault? (cc @Aaron1011) Maybe you can check a temporary revert of that... |
I've minimised the segfaulting code to:

pub use rustc_target;
fn test() {
    cstr::cstr!("cmse_nonsecure_call");
}

Deleting either the function or the |
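For readers unfamiliar with the macro in the reduced test case: `cstr::cstr!` is a procedural macro that expands to a `&'static CStr`. The sketch below builds an equivalent value with the standard library only, so no proc-macro expansion is involved. It is an illustration of what the macro computes, not the patch that was actually tested in this thread; the function name `cmse_nonsecure_call_name` is made up for the example.

```rust
use std::ffi::CStr;

/// Returns the same kind of value that `cstr::cstr!("cmse_nonsecure_call")`
/// expands to, without going through a procedural macro.
/// (Hypothetical helper name, chosen for this example only.)
fn cmse_nonsecure_call_name() -> &'static CStr {
    // The byte literal must end in a NUL and contain no interior NULs;
    // `from_bytes_with_nul` verifies this at run time.
    CStr::from_bytes_with_nul(b"cmse_nonsecure_call\0")
        .expect("literal is NUL-terminated and has no interior NUL")
}

fn main() {
    let name = cmse_nonsecure_call_name();
    println!("{:?}", name);
}
```

The trade-off is that the check moves from compile time (inside the macro) to run time (inside `from_bytes_with_nul`).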
Further found out that |
@Mark-Simulacrum unfortunately the build still fails with the same error, after reverting that PR specifically on top of 1.55.0. Wondering if @cuviper has any ideas; apparently 1.55 built fine on Fedora https://kojipkgs.fedoraproject.org//packages/rust/1.55.0/1.el7/data/logs/ppc64le/build.log (though there are more test failures than we usually have on ppc64el in Debian). |
I am trying to figure out if there are any significant differences between rust and Debian's LLVM; is it just the [rust] commits in https://github.com/rust-lang/llvm-project/commits/bdb386270f55cb8e95793daa296f27a95a6d4834 ? |
I can work around the problem by patching away the calls to Separately, is there a proc-macro heavy crate or system of crates I can test this against, to ensure it doesn't break them? |
I noticed that you have a very old To that end, I can reproduce the crash in a |
Thanks for that. It seems I could use the workaround mentioned above, then un-apply it once we find the time to update cargo in Debian. FWIW I did check over the cargo release notes and didn't notice anything that stood out. As I understand it, the interface between rustc and cargo is a high-level one involving basic interprocess UNIX things like envvars and command-line flags (as opposed to low-level ABI things like linking and CPU/hardware-specific details). Even if cargo is out of sync with rustc, it would just be supplying unexpected command-line options or envvars or something along these lines, which a user could have done manually. Indeed, when reproducing this segfault manually for debugging and minimisation, I am only calling rustc and not dealing with cargo at all. Even the dependencies it uses were ultimately produced directly by rustc, and only indirectly via cargo. So I would imagine that rustc is not supposed to segfault even if cargo goes wildly out of date with it, and there is still a bug somewhere? |
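To make the "envvars and command-line flags" point concrete, here is a small sketch of how a build tool describes a compiler invocation as a child process. It only constructs and inspects the command; nothing is spawned. The file name `lib.rs` and the variable `EXAMPLE_VAR` are placeholders invented for the example, and the flags are not the ones cargo passed in the failing build.

```rust
use std::process::Command;

/// Describes a compiler invocation the way a build tool would:
/// a program name, command-line flags, and environment variables.
/// The command is built but never executed.
fn describe_invocation() -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg("--edition=2018")
        .arg("-Copt-level=2")
        .arg("lib.rs") // placeholder input file
        .env("EXAMPLE_VAR", "1"); // placeholder environment variable
    cmd
}

fn main() {
    let cmd = describe_invocation();
    println!("program: {:?}", cmd.get_program());
    for arg in cmd.get_args() {
        println!("arg: {:?}", arg);
    }
    for (key, value) in cmd.get_envs() {
        println!("env: {:?} = {:?}", key, value);
    }
}
```

Everything visible here could also be typed by hand in a shell, which is the basis for the argument that a mismatched cargo should not be able to make rustc segfault.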
It also fails with upstream 0.47.0 / 1.46.0, but it works with cargo 0.48.0 / 1.47.0, and this does have a relevant change:
So if there's a powerpc64le optimization bug arising in proc-macros, as your investigation has pinpointed in
Yes, I agree, failures due to version mismatch should be much more obvious / less harmful than a segfault. |
Another test: using the system cargo and rustc, and the bundled LLVM, it still crashes in the same place. |
Thanks for that, I've also reproduced the same crash with Debian rustc + upstream cargo 1.55.0 by setting |
The crash does not reproduce with plain upstream rustc 1.55, using upstream rustc+cargo 1.54 as bootstrap. I'll try to figure out what the difference is with the Debian rustc that might be triggering this. |
OK, I can reproduce the crash with plain upstream (non-Debian) rustc 1.55 using the following minimal config.toml, and setting
The segfault reproduces with any of rls, clippy, or rustfmt in place of cargo in tools, and goes away (compiles OK) when replaced with analysis or src. To confirm, the segfault happens while building stage1 rustc_codegen_llvm, before the build even gets to building the tools, so this is a bit confusing. Would you have any theories @Mark-Simulacrum ? |
This definitely seems.. surprising, I wouldn't expect that stage of the build to even be affected by which tools are enabled/disabled. This is deterministic? Is there any difference in logs (e.g., maybe the features set by Cargo are different for some crate?) I did just notice one interesting thing, that might be worth checking. The compile log you showed above has -Ztls-model=initial-exec being passed, but on our current master, we've disabled that for powerpc targets after #81334 was filed -- #85807 landed in 1.56, so it would not have been included in 1.55. However we added that to the build system in 1.49 (#78201) so I would have expected this problem to arise earlier... Given that the problem seems to be proc-macro related, it seems at least plausible that the cause is due to lacking support in LLVM that has been undetected until now. Procedural macros definitely make extensive use of TLS to hold state while passing it back and forth, so it wouldn't surprise me too much if the bugs are related. So I think next steps:
|
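As background for the TLS remark above, the following is a minimal sketch of the general pattern of keeping per-thread state in a thread-local and reaching it from nested calls without passing it as an argument. It is not the actual `proc_macro` bridge code; the names `SESSION`, `with_session`, and `current_session` are invented for the illustration. Accesses like these are the ones whose generated code depends on the TLS model selected for the target.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread state, standing in for whatever an
    // expansion session needs to reach from deep inside nested calls.
    static SESSION: RefCell<Option<String>> = RefCell::new(None);
}

/// Installs `value` as the current thread's session while `f` runs,
/// then restores the previous value. (This simple version does not
/// restore the previous value if `f` panics.)
fn with_session<R>(value: &str, f: impl FnOnce() -> R) -> R {
    let previous = SESSION.with(|s| s.replace(Some(value.to_owned())));
    let result = f();
    SESSION.with(|s| *s.borrow_mut() = previous);
    result
}

/// Reads the session from anywhere down the call stack.
fn current_session() -> Option<String> {
    SESSION.with(|s| s.borrow().clone())
}

fn main() {
    let seen = with_session("expansion-1", current_session);
    println!("inside: {:?}", seen);
    println!("outside: {:?}", current_session());
}
```

If thread-local access were miscompiled on some target, code following this pattern would read or write the wrong memory, which is why a TLS-related cause is at least plausible for a proc-macro crash.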
That change for "powerpc-" would only affect 32-bit, not powerpc64le. |
Ah, good catch. Still seems worth finding out if we should actually disable on 64-bit as well... |
We've actually been including #85807 in Debian since 1.49, but yeah, that's only for 32-bit ppc. I amended it to include 64-bit ppc but the segfault remains. (And in my original testing, the segfault reproduced without any special flags, although that was just repeatedly attempting to compile rustc_codegen_llvm and leaving the already-built dependencies alone.) In the meantime I've found that the segfault reproduces with CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_OPT_LEVEL=2 but not with =1. Also, the extended / tools thing was just a red herring; it was just because I was running Next, I will try to figure out which particular crate's build.rs makes the difference when switched between OPT 1 and 2. |
Can you see if beta and nightly also fail? Beta has LLVM 13, and nightly has further enabled the new pass manager. |
Beta segfaults, nightly works (with beta as bootstrap). Tested with
It seems cargo doesn't support per-crate overrides, only whole-workspace overrides, so I can't easily track this down. https://doc.rust-lang.org/cargo/reference/profiles.html#overrides Since it's fixed on nightly, perhaps I'll leave the investigation here. |
Failure log (warning very large): 1.55
There was no failure on 1.54; the same version of LLVM was used in both cases - 12.0.1 (Debian version 1:12.0.1-9)
The only change made to rustc_codegen_llvm between 1.54 and 1.55 was #86416 so CC @Amanieu .
There was a later MR #88350 to fix something on powerpc64, however this just mentions "lack of support" rather than a segfault, so I don't know if this is related. Shall I try backporting it onto 1.55 in the meantime?
I will continue trying to debug the segfault.
The output contains the following stack-trace-like dump, not sure if it's useful: