Workaround for windows-gnu rust-lld test failure #140396

Merged
merged 1 commit into from
Apr 29, 2025

Conversation

ChrisDenton
Member

@ChrisDenton ChrisDenton commented Apr 28, 2025

The test run-make/amdgpu-kd has an issue on windows-gnu where rust-lld will sometimes fail with error 0xc0000374 (STATUS_HEAP_CORRUPTION).

This works around the issue by passing --threads=1 to the linker, as suggested in a comment on #115985. Note that I don't know whether this will help: the failure happens only sometimes in our CI, so it's hard to test.
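
As a rough illustration of the workaround (a minimal sketch, not the actual diff from this PR), the helper below builds a rustc argument list and appends `-Clink-arg=--threads=1` only for windows-gnu. The names `rustc_args` and `host_is_windows_gnu`, the `is_windows_gnu` parameter, and the input file `foo.rs` are assumptions made for illustration. `-C link-arg` is the rustc option that forwards one argument to the linker, and `--threads` is the lld option being set.

```rust
/// Builds the rustc arguments for a hypothetical test invocation.
/// `is_windows_gnu` is passed in explicitly so both paths can be exercised on any host.
fn rustc_args(input: &str, is_windows_gnu: bool) -> Vec<String> {
    let mut args = vec![input.to_string()];
    if is_windows_gnu {
        // Forward `--threads=1` to the linker so rust-lld runs single-threaded.
        args.push("-Clink-arg=--threads=1".to_string());
    }
    args
}

/// Reports at compile time whether this build targets windows-gnu.
fn host_is_windows_gnu() -> bool {
    cfg!(all(windows, target_env = "gnu"))
}
```

A caller would typically write `rustc_args("foo.rs", host_is_windows_gnu())`, so other platforms keep the linker's default threading.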

@rustbot
Collaborator

rustbot commented Apr 28, 2025

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the labels A-run-make (Area: port run-make Makefiles to rmake.rs), S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties), and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue) Apr 28, 2025
@rustbot
Collaborator

rustbot commented Apr 28, 2025

This PR modifies run-make tests.

cc @jieyouxu

@ChrisDenton
Member Author

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 28, 2025
Workaround for windows-gnu rust-lld test failure

The test run-make/amdgpu-kd has an issue on windows-gnu where rust-lld will sometimes fail with error 0xc0000374 (`STATUS_HEAP_CORRUPTION`).

This works around the issue by passing `--threads=1` to the linker as suggested [here](rust-lang#115985 (comment)). Note I don't know if this will help and it happens only sometimes in our CI so it's hard to test.

try-job: x86_64-mingw-1
@bors
Collaborator

bors commented Apr 28, 2025

⌛ Trying commit b107e98 with merge 46b91f6...

Member

@jieyouxu jieyouxu left a comment

Yeah, this is one of the tests where this can happen, but there's also other tests that run into rust-lld heap corruption.

r=me if the try-job comes back green. I guess even r=me if it fails due to heap corruption anyway (hopefully a single thread makes it less likely)...

@petrochenkov
Contributor

@ChrisDenton Have you seen the RUSTC_RETRY_LINKER_ON_SEGFAULT hack in the compiler? 😄
If the issue may appear on something different than tests/run-make/amdgpu-kd/rmake.rs then something like that may make sense on Windows as well.

@jieyouxu
Member

Lmao I did not know that was a thing

@ChrisDenton
Member Author

Me neither, ha.

Though this issue does seem to be specific to windows-gnu, rust-lld and this test in particular. I looked at some of the older test failures and those tests don't seem to exist any more (I guess this test replaced them?), for example tests/ui/amdgpu-require-explicit-cpu.rs.

@jieyouxu
Member

jieyouxu commented Apr 28, 2025

I believe there's also avr-rjmp-offset where we disabled the test on windows-gnu because of rust-lld too (#133480)

@ChrisDenton
Member Author

ChrisDenton commented Apr 28, 2025

Ah, that's already disabled so I didn't see it in the logs. Also it is a STATUS_ACCESS_VIOLATION instead of a STATUS_HEAP_CORRUPTION but it possibly has a similar cause.

@jieyouxu
Member

> @ChrisDenton Have you seen the RUSTC_RETRY_LINKER_ON_SEGFAULT hack in the compiler? 😄 If the issue may appear on something different than tests/run-make/amdgpu-kd/rmake.rs then something like that may make sense on Windows as well.

Apparently that was for a similarly cursed problem, #38878.

// Here's a terribly awful hack that really shouldn't be present in any
// compiler. Here an environment variable is supported to automatically
// retry the linker invocation if the linker looks like it segfaulted.
//
// Gee that seems odd, normally segfaults are things we want to know
// about! Unfortunately though in rust-lang/rust#38878 we're
// experiencing the linker segfaulting on Travis quite a bit which is
// causing quite a bit of pain to land PRs when they spuriously fail
// due to a segfault.
//
// The issue #38878 has some more debugging information on it as well,
// but this unfortunately looks like it's just a race condition in
// macOS's linker with some thread pool working in the background. It
// seems that no one currently knows a fix for this so in the meantime
// we're left with this...
if !retry_on_segfault || i > 3 {
    break;
}
let msg_segv = "clang: error: unable to execute command: Segmentation fault: 11";
let msg_bus = "clang: error: unable to execute command: Bus error: 10";
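
For readers who have not seen that hack, the control flow around the excerpt can be sketched roughly as follows. This is a simplified, self-contained sketch and not the compiler's code: `LinkAttempt`, `link_with_retry`, and the closure standing in for the linker invocation are hypothetical names, and only the two message strings and the `!retry_on_segfault || i > 3` stop condition are taken from the excerpt above.

```rust
/// Messages that suggest the linker driver crashed (copied from the excerpt above).
const MSG_SEGV: &str = "clang: error: unable to execute command: Segmentation fault: 11";
const MSG_BUS: &str = "clang: error: unable to execute command: Bus error: 10";

/// Result of one hypothetical linker run: whether it succeeded, plus its captured output.
struct LinkAttempt {
    success: bool,
    output: String,
}

/// Runs `link` and retries while the failure looks like a crash.
/// Returns the final attempt and how many times `link` was called.
fn link_with_retry<F>(retry_on_segfault: bool, mut link: F) -> (LinkAttempt, u32)
where
    F: FnMut() -> LinkAttempt,
{
    let mut i = 0;
    loop {
        i += 1;
        let attempt = link();
        if attempt.success {
            return (attempt, i);
        }
        // Same stop condition as the excerpt: retries disabled, or too many tries.
        if !retry_on_segfault || i > 3 {
            return (attempt, i);
        }
        // Only retry when the output looks like a crash, not an ordinary link error.
        if !(attempt.output.contains(MSG_SEGV) || attempt.output.contains(MSG_BUS)) {
            return (attempt, i);
        }
    }
}
```

With this shape, an ordinary link error is reported after the first attempt, while a crash-like failure is attempted at most four times in total.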

@ChrisDenton
Member Author

ChrisDenton commented Apr 28, 2025

So I think I'd personally prefer not to pile on the hack if it can be avoided. If this PR fixes the issue and we don't get more cropping up then I'd like to leave that alone. If it does crop up again then adding 0xc0000374 and 0xc0000005 to the hack would be the least worst option, I guess.

I do wonder if we could work around this in run-make. The thing the avr test and this one have in common is cross-compiling for a more specialised target. But I'd like to first confirm that setting --threads=1 does in fact help.
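
To make the fallback option concrete, here is a minimal sketch of how the two exit codes could be recognised. It is not part of this PR or of the compiler: the function name `is_spurious_linker_crash` is hypothetical, and the sketch assumes the exit code arrives as the `Option<i32>` that `std::process::ExitStatus::code()` returns. The constants are the Windows NTSTATUS values discussed in this thread, 0xc0000374 (`STATUS_HEAP_CORRUPTION`) and 0xc0000005 (`STATUS_ACCESS_VIOLATION`).

```rust
/// NTSTATUS values for the two crash kinds mentioned in this thread.
const STATUS_ACCESS_VIOLATION: u32 = 0xC000_0005;
const STATUS_HEAP_CORRUPTION: u32 = 0xC000_0374;

/// Returns true if a process exit code matches one of the crash codes above.
/// The code is an `i32`, so it is reinterpreted as `u32` before comparing.
fn is_spurious_linker_crash(exit_code: Option<i32>) -> bool {
    match exit_code {
        Some(code) => matches!(
            code as u32,
            STATUS_ACCESS_VIOLATION | STATUS_HEAP_CORRUPTION
        ),
        // No exit code available (for example, killed by a signal on Unix).
        None => false,
    }
}
```

Keeping the check to an explicit list of codes means an ordinary linker failure, such as exit code 1, is still reported immediately.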

@jieyouxu
Member

Yeah, that is reasonable

@jieyouxu
Member

@bors r+ rollup

@bors
Collaborator

bors commented Apr 28, 2025

📌 Commit 52594ef has been approved by jieyouxu

It is now in the queue for this repository.

@bors bors added the label S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion) and removed the label S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties) Apr 28, 2025
@ChrisDenton
Member Author

ChrisDenton commented Apr 28, 2025

That seems to be an entirely different spurious failure 😅: a stack overflow in mir-opt\building\issue_49232.rs. We've had other such spurious errors before, but they seem to come and go, e.g. #138110.

@jieyouxu
Member

That is... yeah.

@jieyouxu
Member

Oh I didn't realize the try-job didn't finish... Just in case
@bors r-

@bors bors added the label S-waiting-on-author (Status: This is awaiting some action (such as code changes or more information) from the author) and removed the label S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion) Apr 28, 2025
@jieyouxu
Member

@bors r+

@bors
Collaborator

bors commented Apr 28, 2025

📌 Commit 52594ef has been approved by jieyouxu

It is now in the queue for this repository.

@bors bors added the label S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion) and removed the label S-waiting-on-author (Status: This is awaiting some action (such as code changes or more information) from the author) Apr 28, 2025
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 28, 2025
Workaround for windows-gnu rust-lld test failure

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 28, 2025
…enton

Rollup of 9 pull requests

Successful merges:

 - rust-lang#139308 (add autodiff inline)
 - rust-lang#140276 (Do not compute type_of for impl item if impl where clauses are unsatisfied)
 - rust-lang#140302 (Move inline asm check to typeck, properly handle aliases)
 - rust-lang#140323 (Implement the internal feature `cfg_target_has_reliable_f16_f128`)
 - rust-lang#140374 (Resolve instance for SymFn in global/naked asm)
 - rust-lang#140391 (Rename sub_ptr to offset_from_unsigned in docs)
 - rust-lang#140394 (Make bootstrap git tests more self-contained)
 - rust-lang#140396 (Workaround for windows-gnu rust-lld test failure)
 - rust-lang#140402 (only return nested goals for `Certainty::Yes`)

r? `@ghost`
`@rustbot` modify labels: rollup
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 28, 2025
Workaround for windows-gnu rust-lld test failure

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 28, 2025
…enton

Rollup of 10 pull requests

Successful merges:

 - rust-lang#139308 (add autodiff inline)
 - rust-lang#139656 (Stabilize `slice_as_chunks` library feature)
 - rust-lang#140022 (allow deref patterns to move out of boxes)
 - rust-lang#140276 (Do not compute type_of for impl item if impl where clauses are unsatisfied)
 - rust-lang#140302 (Move inline asm check to typeck, properly handle aliases)
 - rust-lang#140323 (Implement the internal feature `cfg_target_has_reliable_f16_f128`)
 - rust-lang#140391 (Rename sub_ptr to offset_from_unsigned in docs)
 - rust-lang#140394 (Make bootstrap git tests more self-contained)
 - rust-lang#140396 (Workaround for windows-gnu rust-lld test failure)
 - rust-lang#140402 (only return nested goals for `Certainty::Yes`)

Failed merges:

 - rust-lang#139765 ([beta] Delay `hash_extract_if` stabilization from 1.87 to 1.88)

r? `@ghost`
`@rustbot` modify labels: rollup
The test run-make/amdgpu-kd has an issue where rust-lld will sometimes fail with error 0xc0000374 (STATUS_HEAP_CORRUPTION).
@ChrisDenton
Member Author

Oh, oops, I need to be more explicit about types. I forgot the test is ignored on my local machine.

@bors r=jieyouxu

@bors
Collaborator

bors commented Apr 28, 2025

📌 Commit 3c42dc2 has been approved by jieyouxu

It is now in the queue for this repository.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 29, 2025
…enton

Rollup of 10 pull requests

Successful merges:

 - rust-lang#139308 (add autodiff inline)
 - rust-lang#139656 (Stabilize `slice_as_chunks` library feature)
 - rust-lang#140022 (allow deref patterns to move out of boxes)
 - rust-lang#140276 (Do not compute type_of for impl item if impl where clauses are unsatisfied)
 - rust-lang#140302 (Move inline asm check to typeck, properly handle aliases)
 - rust-lang#140323 (Implement the internal feature `cfg_target_has_reliable_f16_f128`)
 - rust-lang#140391 (Rename sub_ptr to offset_from_unsigned in docs)
 - rust-lang#140394 (Make bootstrap git tests more self-contained)
 - rust-lang#140396 (Workaround for windows-gnu rust-lld test failure)
 - rust-lang#140402 (only return nested goals for `Certainty::Yes`)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 469f03d into rust-lang:master Apr 29, 2025
6 checks passed
@rustbot rustbot added this to the 1.88.0 milestone Apr 29, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 29, 2025
Rollup merge of rust-lang#140396 - ChrisDenton:gnu-threads, r=jieyouxu

Workaround for windows-gnu rust-lld test failure

@ChrisDenton ChrisDenton deleted the gnu-threads branch April 29, 2025 08:50
Labels
A-run-make (Area: port run-make Makefiles to rmake.rs), S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue)
6 participants