run-make-fulldeps/pgo-branch-weights fails on AArch64 Linux #78226

Closed
pietroalbini opened this issue Oct 22, 2020 · 23 comments
Labels
C-bug Category: This is a bug. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state P-critical Critical priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@pietroalbini
Member

The most recent builds on AArch64 Linux are failing on this test:

2020-10-22T08:53:26.0713848Z ---- [run-make] run-make-fulldeps/pgo-branch-weights stdout ----
2020-10-22T08:53:26.0723060Z 
2020-10-22T08:53:26.0727694Z error: make failed
2020-10-22T08:53:26.0732974Z status: exit code: 2
2020-10-22T08:53:26.0738071Z command: "make"
2020-10-22T08:53:26.0742831Z stdout:
2020-10-22T08:53:26.0750881Z ------------------------------------------
2020-10-22T08:53:26.0763884Z # We don't compile `opaque` with either optimizations or instrumentation.
2020-10-22T08:53:26.0783737Z LD_LIBRARY_PATH="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights:/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0-bootstrap-tools/aarch64-unknown-linux-gnu/release/deps:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights -L /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights   opaque.rs || exit 1
2020-10-22T08:53:26.0791260Z # Compile the test program with instrumentation
2020-10-22T08:53:26.0794012Z mkdir -p "/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/prof_data_dir" || exit 1
2020-10-22T08:53:26.0803441Z LD_LIBRARY_PATH="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights:/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0-bootstrap-tools/aarch64-unknown-linux-gnu/release/deps:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights -L /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights   interesting.rs \
2020-10-22T08:53:26.0813242Z 	-Cprofile-generate="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/prof_data_dir" -O -Ccodegen-units=1 || exit 1
2020-10-22T08:53:26.0824507Z LD_LIBRARY_PATH="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights:/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0-bootstrap-tools/aarch64-unknown-linux-gnu/release/deps:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights -L /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights   main.rs -Cprofile-generate="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/prof_data_dir" -O || exit 1
2020-10-22T08:53:26.0833472Z # The argument below generates to the expected branch weights
2020-10-22T08:53:26.0841671Z LD_LIBRARY_PATH="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights:/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/lib/rustlib/aarch64-unknown-linux-gnu/lib:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0-bootstrap-tools/aarch64-unknown-linux-gnu/release/deps:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/lib" /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/main aaaaaaaaaaaa2bbbbbbbbbbbb2bbbbbbbbbbbbbbbbcc || exit 1
2020-10-22T08:53:26.0849409Z "/checkout/obj/build/aarch64-unknown-linux-gnu/llvm/build/bin/llvm-profdata" merge \
2020-10-22T08:53:26.0852410Z 	-o "/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/prof_data_dir/merged.profdata" \
2020-10-22T08:53:26.0855865Z 	"/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/prof_data_dir" || exit 1
2020-10-22T08:53:26.0864853Z LD_LIBRARY_PATH="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights:/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0-bootstrap-tools/aarch64-unknown-linux-gnu/release/deps:/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights -L /checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights   interesting.rs \
2020-10-22T08:53:26.0874373Z 	-Cprofile-use="/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/prof_data_dir/merged.profdata" -O \
2020-10-22T08:53:26.0876942Z 	-Ccodegen-units=1 --emit=llvm-ir || exit 1
2020-10-22T08:53:26.0880673Z cat "/checkout/obj/build/aarch64-unknown-linux-gnu/test/run-make-fulldeps/pgo-branch-weights/pgo-branch-weights/interesting.ll" | "/checkout/obj/build/aarch64-unknown-linux-gnu/llvm/build/bin/FileCheck" filecheck-patterns.txt
2020-10-22T08:53:26.0883188Z 
2020-10-22T08:53:26.0884121Z ------------------------------------------
2020-10-22T08:53:26.0884722Z stderr:
2020-10-22T08:53:26.0885663Z ------------------------------------------
2020-10-22T08:53:26.0887152Z filecheck-patterns.txt:5:8: error: CHECK: expected string not found in input
2020-10-22T08:53:26.0889855Z CHECK: define void @function_called_twice(i32 %c) {{.*}} !prof [[function_called_twice_id:![0-9]+]] {
2020-10-22T08:53:26.0890870Z        ^
2020-10-22T08:53:26.0891548Z <stdin>:1:1: note: scanning from here
2020-10-22T08:53:26.0893037Z ; ModuleID = 'interesting.3a1fbbbh-cgu.0'
2020-10-22T08:53:26.0893984Z ^
2020-10-22T08:53:26.0894732Z <stdin>:7:1: note: possible intended match here
2020-10-22T08:53:26.0895788Z define void @function_called_twice(i32 %c) unnamed_addr #0 {
2020-10-22T08:53:26.0896578Z ^
2020-10-22T08:53:26.0896922Z 
2020-10-22T08:53:26.0897496Z Input file: <stdin>
2020-10-22T08:53:26.0898859Z Check file: filecheck-patterns.txt
2020-10-22T08:53:26.0899576Z 
2020-10-22T08:53:26.0900782Z -dump-input=help explains the following input dump.
2020-10-22T08:53:26.0901455Z 
2020-10-22T08:53:26.0902005Z Input was:
2020-10-22T08:53:26.0902571Z <<<<<<
2020-10-22T08:53:26.0903979Z            1: ; ModuleID = 'interesting.3a1fbbbh-cgu.0'
2020-10-22T08:53:26.0905561Z check:5'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
2020-10-22T08:53:26.0908184Z            2: source_filename = "interesting.3a1fbbbh-cgu.0"
2020-10-22T08:53:26.0909684Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0911280Z            3: target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
2020-10-22T08:53:26.0912831Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0914249Z            4: target triple = "aarch64-unknown-linux-gnu"
2020-10-22T08:53:26.0915583Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0916181Z            5: 
2020-10-22T08:53:26.0917010Z check:5'0     ~
2020-10-22T08:53:26.0918142Z            6: ; Function Attrs: noinline nonlazybind uwtable
2020-10-22T08:53:26.0919605Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0920542Z            7: define void @function_called_twice(i32 %c) unnamed_addr #0 {
2020-10-22T08:53:26.0921934Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0923158Z check:5'1     ?                                                            possible intended match
2020-10-22T08:53:26.0923948Z            8: start:
2020-10-22T08:53:26.0924831Z check:5'0     ~~~~~~
2020-10-22T08:53:26.0925481Z            9:  %0 = icmp eq i32 %c, 50
2020-10-22T08:53:26.0926480Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0927169Z           10:  br i1 %0, label %bb2, label %bb1
2020-10-22T08:53:26.0928608Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0929312Z           11: 
2020-10-22T08:53:26.0930503Z check:5'0     ~
2020-10-22T08:53:26.0931285Z           12: bb1: ; preds = %start
2020-10-22T08:53:26.0932694Z check:5'0     ~~~~~~~~~~~~~~~~~~~~~
2020-10-22T08:53:26.0933375Z            .
2020-10-22T08:53:26.0934086Z            .
2020-10-22T08:53:26.0934834Z            .
2020-10-22T08:53:26.0935442Z >>>>>>
2020-10-22T08:53:26.0936135Z make: *** [Makefile:42: all] Error 1
2020-10-22T08:53:26.0936642Z 
2020-10-22T08:53:26.0937694Z ------------------------------------------
2020-10-22T08:53:26.0938123Z 
2020-10-22T08:53:26.0938461Z 
2020-10-22T08:53:26.0938808Z 
2020-10-22T08:53:26.0939349Z failures:
2020-10-22T08:53:26.0940684Z     [run-make] run-make-fulldeps/pgo-branch-weights
2020-10-22T08:53:26.0941493Z 
2020-10-22T08:53:26.0942824Z test result: FAILED. 211 passed; 1 failed; 7 ignored; 0 measured; 0 filtered out
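For anyone skimming the log: the failing CHECK line requires a `!prof` metadata attachment on the `define` line of `function_called_twice`, and the emitted IR has none, which suggests the profile contained no usable data for that function. Below is a minimal sketch of what that one pattern demands, using plain string matching. It is only an illustration and not how FileCheck matches input; the helper name and the `!29` metadata id are made up.

```rust
/// Simplified stand-in for the CHECK pattern at filecheck-patterns.txt:5.
/// Assumption: plain string matching is enough for illustration; this is
/// not how FileCheck itself matches input.
fn define_line_has_prof(line: &str) -> bool {
    const SIGNATURE: &str = "define void @function_called_twice(i32 %c)";
    const PROF: &str = " !prof !";

    // The pattern starts with the literal function signature.
    let Some(rest) = line.strip_prefix(SIGNATURE) else {
        return false;
    };
    // `{{.*}}` allows any attributes before the `!prof` attachment.
    let Some(pos) = rest.find(PROF) else {
        return false;
    };
    // `![0-9]+` needs at least one digit, followed by the opening brace.
    let after = &rest[pos + PROF.len()..];
    let digits = after.chars().take_while(|c| c.is_ascii_digit()).count();
    digits > 0 && after[digits..].starts_with(" {")
}

fn main() {
    // Line 7 of the IR dump in the log above.
    let from_log = "define void @function_called_twice(i32 %c) unnamed_addr #0 {";
    // Hypothetical line with profile metadata; the id `!29` is made up.
    let with_prof =
        "define void @function_called_twice(i32 %c) unnamed_addr #0 !prof !29 {";

    println!("log line matches:          {}", define_line_has_prof(from_log));
    println!("hypothetical line matches: {}", define_line_has_prof(with_prof));
}
```

The line from the log is rejected because the `!prof` attachment is absent, not because of the `unnamed_addr #0` attributes, which the `{{.*}}` part of the pattern tolerates.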
@pietroalbini pietroalbini added O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state C-bug Category: This is a bug. labels Oct 22, 2020
@pietroalbini
Member Author

@rustbot ping arm

@rustbot
Collaborator

rustbot commented Oct 22, 2020

Hey ARM Group! This bug has been identified as a good "ARM candidate".
In case it's useful, here are some instructions for tackling these sorts of
bugs. Maybe take a look?
Thanks! <3

cc @JamieCunliffe @joaopaulocarreiro @raw-bin @Stammark @vigoux

@JohnTitor
Member

Looks like my rollup (#78212) contains the cause.

@raw-bin

raw-bin commented Oct 22, 2020

@JohnTitor: Thanks for the tip! We're looking into this.

@JamieCunliffe
Contributor

Based on that I reverted #77554 (git revert 813066c4429e7582b08dbf3af2c12a5f2e1b2a16 -m 1) and the test then passed.

@pietroalbini
Member Author

This seems to be fixed now, CI is passing again.

@Mark-Simulacrum
Member

Reopening this issue as it's hitting us (again) on the beta promotion PR, #86413.

I've tried various things to debug but so far haven't really arrived at any conclusions. It doesn't seem immediately related to anything in the PR itself, which should in theory have minimal effect on runtime behavior here.

@rustbot ping arm -- since this is only failing on the aarch64 builder, I am currently presuming some relationship, but it's not clear. The test did not fail on the only aarch64 machine I readily have access to when I ran it indirectly (i.e., copying the input files and simulating the Makefile via manual local invocations).

@rustbot
Collaborator

rustbot commented Jun 20, 2021

Error: Parsing ping command in comment failed: ...'t ping arm' | error: expected end of command at >| ' -- since '...

Please let @rust-lang/release know if you're having trouble with this bot.

@Mark-Simulacrum
Member

@rustbot ping arm

@rustbot
Collaborator

rustbot commented Jun 20, 2021

Hey ARM Group! This bug has been identified as a good "ARM candidate".
In case it's useful, here are some instructions for tackling these sorts of
bugs. Maybe take a look?
Thanks! <3

cc @adamgemmell @hug-dev @jacobbramley @JamieCunliffe @joaopaulocarreiro @raw-bin @Stammark @vigoux

@jacobbramley
Contributor

@Mark-Simulacrum It's not obvious what's going on here, but we'll look into it. Thanks for the ping.

@Mark-Simulacrum Mark-Simulacrum added A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) and removed A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) labels Jun 21, 2021
@Mark-Simulacrum Mark-Simulacrum added this to the 1.54.0 milestone Jun 22, 2021
@Mark-Simulacrum Mark-Simulacrum added P-critical Critical priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 22, 2021
@Mark-Simulacrum
Member

I've disabled the test on #86413 so that we can get a beta out, but I would like to see a fix prior to the 1.54 release (in ~5 weeks), so P-critical and milestoned at 1.54.

@pnkfelix
Member

@jacobbramley are you willing to own this issue, so that we have some tracking of who is looking into it before the 1.54 release happens?

@jacobbramley
Contributor

> @jacobbramley are you willing to own this issue, so that we have some tracking of who is looking into it before the 1.54 release happens?

Currently, @JamieCunliffe (also in the Arm team) is working on it so he's perhaps a better choice.

@JamieCunliffe
Contributor

@Mark-Simulacrum mentioned they couldn't reproduce this locally. I have managed to reproduce it locally with the following config, which I got by running one of the CI scripts.

changelog-seen = 2
[llvm]
assertions = true
[build]
submodules = false
locked-deps = true
profiler = true
[install]
[rust]
codegen-units-std = 1
debug-assertions = true
channel = 'beta'
verbose-tests = true
dist-src = false
[target.aarch64-unknown-linux-gnu]
[dist]
compression-formats = ['xz']

That fails on master 50e0cc59ffcacda5b48f4edb95e5a5c353624fb0 and beta bf62f4de32a085c1080d9e77b1b73f1a8e42cce2 for me locally, although the test is currently passing in the CI for master.

Changing codegen-units-std to 16 appears to fix the test for me locally...

I'll continue investigating this though.

@JamieCunliffe
Contributor

According to a bisect, it was caused by #85891. I'm not sure yet why that revert is causing it, though.

Given there is already a revert open (#86143) what's the best way to test it on the beta CI to see if it was actually the cause?

@Mark-Simulacrum
Member

Thanks for the bisection! I suspect the best way is to try to land a revert (of the revert? not sure how revert-y we've gotten there) on the beta branch, while enabling this test (reverting 793b005).

Based on a cursory inspection of these PRs, it seems plausible that this bisection does not point at the underlying root cause. My guess is something subtle within the linker or so is at play... but I'm OK closing this issue if we can get the test re-enabled on beta. However, I suspect we'll not want to backport #86143 to beta -- it's a pretty large PR. Checking whether it fixes things by temporarily landing it on the beta branch (or with a try build on the aarch64 builder) seems OK though. We can use that information to determine what the best next step is.

One thought on a potential cause is that if PGO is affected (and I'd guess it is) by the codegen-units and relative ordering of symbols, then #85891 certainly seems like it might've influenced both of those...

cc @bjorn3 @michaelwoerister -- any thoughts on how #85891 might've caused this bug?

@bjorn3
Member

bjorn3 commented Jul 6, 2021

> cc @bjorn3 @michaelwoerister -- any thoughts on how #85891 might've caused this bug?

It changes symbol names and probably the order of some hashmaps. This might have changed the codegened code just enough to expose a pre-existing bug I would guess.

@michaelwoerister
Member

The linker certainly can be a source for PGO related errors, e.g. if the profiling runtime doesn't get linked in correctly.

@JamieCunliffe
Contributor

It does look like this is linker-related. The profile data section isn't being included correctly in the binary, although the counts and other sections appear to be correct. The size of the data section in a working binary is almost the same as the difference we are seeing on the CI.

The test passes when gold is used as the linker but fails with the default linker, which could explain why we aren't seeing this on x86.

With -C link-dead-code the test passes, so not passing the --gc-sections flag when doing profile-generate would be one way to fix this. To me that seems a more reasonable fix than using a different linker as part of the test. It will result in a larger binary, but I don't think that should matter too much, as this binary should only be used for generating profile information. Something similar appears to have been observed here: https://bugzilla.mozilla.org/show_bug.cgi?id=1641674

If that seems reasonable I can submit a PR for not using gc-sections with profile-generate.
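A minimal sketch of the decision proposed above, assuming the linker flags are assembled from a couple of booleans. The `LinkOptions` struct and both function names are hypothetical and are not taken from rustc's source; only the flag names come from this thread.

```rust
/// Hypothetical linker options; these names are illustrative and are not
/// taken from rustc's source.
struct LinkOptions {
    link_dead_code: bool,   // `-C link-dead-code`
    profile_generate: bool, // `-C profile-generate`
}

/// Returns true if `--gc-sections` should be passed to the linker.
fn should_gc_sections(opts: &LinkOptions) -> bool {
    // Keeping dead code already rules out section GC. The proposal adds
    // instrumented builds as a second case, so profile data sections survive.
    !opts.link_dead_code && !opts.profile_generate
}

/// Builds the (heavily simplified) list of linker arguments.
fn linker_args(opts: &LinkOptions) -> Vec<&'static str> {
    let mut args = Vec::new();
    if should_gc_sections(opts) {
        args.push("--gc-sections");
    }
    args
}

fn main() {
    let normal = LinkOptions { link_dead_code: false, profile_generate: false };
    let instrumented = LinkOptions { link_dead_code: false, profile_generate: true };

    println!("normal build:       {:?}", linker_args(&normal));
    println!("instrumented build: {:?}", linker_args(&instrumented));
}
```

The trade-off is the one already mentioned: the instrumented binary keeps unused sections and grows, which should be acceptable for a binary that only exists to collect profile data.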

@Mark-Simulacrum
Member

FWIW, I think we try to use gold as the linker on non-x86:

ifeq ($(UNAME),Linux)
ifneq (,$(findstring x86,$(TARGET)))
COMMON_FLAGS=-Clink-args=-fuse-ld=gold
endif
endif
-- maybe that's not working though.

I'm not sure if it's the right fix to always not gc-sections but I think it's likely a good idea for now -- seems pretty harmless, too; profile-generate isn't really deployed much AFAIK, only run on particular benchmarks or so.

@jacobbramley
Contributor

To be clear: after hacking rustc to pass --print-gc-sections in addition to --gc-sections, we can see that the default linker is deliberately stripping the __llvm_prf_data section. The section is present in all the intermediate steps (even the .o files). It gets restored for the final binary (with dummy contents) by compiler-rt/lib/profile/InstrProfilingPlatformLinux.c, which is why we see that section in the result, but the parts we need have already been stripped by then.

What's not clear to me is what the correct behaviour is, and which tool is misbehaving (if any).

@Mark-Simulacrum That test is, confusingly, the other way around, and applies only to x86 targets. gold does work for AArch64 Linux (and we verified with --print-gc-sections that it doesn't strip the profile data sections).
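To spell out why the Makefile condition reads "the other way around": `findstring x86,$(TARGET)` expands to `x86` when the target contains that substring and to an empty string otherwise, and `ifneq (,...)` is true when the result is non-empty. A rough equivalent as a sketch; the function name is made up, and the x86_64 triple is just an example target.

```rust
/// Mirrors the two nested Makefile conditionals quoted above:
///   ifeq ($(UNAME),Linux)
///   ifneq (,$(findstring x86,$(TARGET)))
/// `findstring` yields "x86" when TARGET contains it, and "" otherwise,
/// so the gold flag is only set for targets whose name contains "x86".
fn test_forces_gold(uname: &str, target: &str) -> bool {
    uname == "Linux" && target.contains("x86")
}

fn main() {
    // Example x86 target (an assumption) and the AArch64 target from the log.
    for target in ["x86_64-unknown-linux-gnu", "aarch64-unknown-linux-gnu"] {
        println!("{target}: gold forced = {}", test_forces_gold("Linux", target));
    }
}
```

So on the AArch64 builder the test links with the default linker, which matches the observation that switching to gold by hand makes it pass.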

bors added a commit to rust-lang-ci/rust that referenced this issue Jul 18, 2021
…-Simulacrum

Don't use gc-sections with profile-generate.

When building with profile-generate don't call gc_sections as this
can sometimes strip out profile data. This missing information in the
prof files can then result in missing functions when using the profile
information.

rust-lang#78226

r? `@Mark-Simulacrum`
@Mark-Simulacrum
Member

T-compiler declined beta backport of #87004, but given that we are currently of the opinion that the particular test failing is somewhat unlikely to be a "new" bug, I'm going to close this issue as fixed by #87004. In particular, it is my current understanding that this issue could have arisen on any past release with the right perturbations to the linker input (largely driven by hashes differing due to rustc version amongst other details), so is not a true regression.
