Skip to content

Deterministic query cycles for parallel front-end#149849

Open
zetanumbers wants to merge 6 commits intorust-lang:mainfrom
zetanumbers:deterministic_cycles
Open

Deterministic query cycles for parallel front-end#149849
zetanumbers wants to merge 6 commits intorust-lang:mainfrom
zetanumbers:deterministic_cycles

Conversation

@zetanumbers
Copy link
Contributor

@zetanumbers zetanumbers commented Dec 10, 2025

The mechanism is similar to cycle detection in single-threaded mode. We traverse the deadlocked query graph from the top active query downwards to subqueries until we visit some query a second time, thus finding a cycle. With multi-thread front-end enabled one query may now have more than one active subqueries, aka we used one of parallel interfaces parallel!, join, par_for_each, etc. As such we have to traverse the "leftmost" active subquery to recover the sequential behavior of these parallel interfaces in single-threaded mode. New TreeNodeIndex saves implicit context information about what join (or scope) task we entered while executing a query, which we then use in break_query_cycle.

However we then have to guarantee the query stack from single-threaded mode is included in the active query graph. This is true for join function as their first task will be completed on the same thread and same will be tried for the second task unless stolen which is fine for us. scope places tasks in local queue and pops them in LIFO maner, while other worker threads could only steal from that queue in FIFO maner, thus we can guarantee the next task is either stolen or available for execution.

Fixes #142064
Fixes #142063
Fixes #127971

UPDATE: commits are sliced to the finest detail

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 10, 2025
@rustbot
Copy link
Collaborator

rustbot commented Dec 10, 2025

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@rustbot
Copy link
Collaborator

rustbot commented Dec 10, 2025

r? @eholk

rustbot has assigned @eholk.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rust-log-analyzer

This comment has been minimized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, first of all, thank you for your work on this. I can see a lot has been done to address the issue!

Unable to speak about entire implementation since don't know much about this part, but one note on this particular file: the logic seems a bit unclear to me due to the magic numbers and bit manipulations. Even though the file isn't large, adding a few explanatory comments would be a big help for clarity

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we'll need a regression test to check it doesn't ICE with these changes

Copy link
Contributor Author

@zetanumbers zetanumbers Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about now? I also moved it to tree_node_index.rs.

@matthiaskrgr
Copy link
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Dec 10, 2025
Deterministic query cycles for parallel front-end
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 10, 2025
@Kivooeo
Copy link
Member

Kivooeo commented Dec 10, 2025

@matthiaskrgr there is a compilation error in this PR, not sure if this can be benchmarked?

Ok this is surprising

@rust-bors
Copy link
Contributor

rust-bors bot commented Dec 10, 2025

☀️ Try build successful (CI)
Build commit: 306a768 (306a768e2f03217e62aee6655739249be1d8aa95, parent: 377656d3dd3f9c23a9c8713e163f4365a5261a84)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (306a768): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.5% [0.2%, 1.8%] 155
Regressions ❌
(secondary)
0.7% [0.2%, 1.9%] 138
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.0% [-0.0%, -0.0%] 1
All ❌✅ (primary) 0.5% [0.2%, 1.8%] 155

Max RSS (memory usage)

Results (secondary -1.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.3% [2.3%, 2.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.1% [-3.5%, -0.9%] 3
All ❌✅ (primary) - - 0

Cycles

Results (primary 2.2%, secondary 2.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.2% [2.0%, 2.4%] 3
Regressions ❌
(secondary)
3.6% [2.8%, 5.3%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-4.6% [-4.6%, -4.6%] 1
All ❌✅ (primary) 2.2% [2.0%, 2.4%] 3

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 471.42s -> 474.532s (0.66%)
Artifact size: 389.04 MiB -> 389.08 MiB (0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 10, 2025
@Zoxc
Copy link
Contributor

Zoxc commented Dec 14, 2025

It's a bit unclear to me what source of non-determinism this tries to fix. Does this alter parallel execution in any way other than picking which query in the cycle use for resumption? It looks to me like you're trying to pick the point in the cycle to break which corresponds to a single threaded execution, but I don't think that point is guaranteed to be resumable.

Currently it looks like we're not deterministically picking which query in a cycle to resume when multiple is present. That's fairly simple to improve, though I think the queries available for resumption is non-deterministic to start with.

@zetanumbers
Copy link
Contributor Author

It's a bit unclear to me what source of non-determinism this tries to fix. Does this alter parallel execution in any way other than picking which query in the cycle use for resumption? It looks to me like you're trying to pick the point in the cycle to break which corresponds to a single threaded execution, but I don't think that point is guaranteed to be resumable.

I make an assumption that when we get a query cycle there is the "point in the cycle to break which corresponds to a single threaded execution" which currently rustc_thread_scope::scope violates due to the order of task executions (and which I plan to replace with join). If you traverse query graph in a "single-threaded manner" you will eventually find a thread waiting for already visited query to finish because graph is finite and every active query during a deadlock has a subquery (either direct or waiting on) to go to next. That last query wait closing a loop has to be resumed. I assume any query can be reached by traversing down to some subquery from the root query which I also assume is unique.

though I think the queries available for resumption is non-deterministic to start with.

That's what I am trying to make deterministic.

@rustbot

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@reddevilmidzy reddevilmidzy added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 9, 2026
@rustbot
Copy link
Collaborator

rustbot commented Jan 13, 2026

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

@rustbot

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 23, 2026
@zetanumbers
Copy link
Contributor Author

zetanumbers commented Feb 5, 2026

cc @petrochenkov

@rustbot

This comment has been minimized.

@rust-bors

This comment has been minimized.

@zetanumbers
Copy link
Contributor Author

Already? Are you freaking serious?

@rust-log-analyzer

This comment has been minimized.

@rustbot

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@zetanumbers
Copy link
Contributor Author

cc @Zoxc

@Zoxc
Copy link
Contributor

Zoxc commented Feb 6, 2026

Say we have a query cycle a() -> b() -> a().

And a program:
join(|| a(), || b())

On a single thread we take the left task and end up resuming b() giving it the error value of a().

Now consider a parallel case:

Thread 2 runs the right side first so we get the stack:
right task -> b() -> a() -> waiting on b()

Then Thread 1 runs the left side and gets the following stack:
left task -> waiting on a()

All threads are waiting so we enter the deadlock handler, and to match the single thread case we need to resume b(), however none of our threads can do that.

@zetanumbers
Copy link
Contributor Author

zetanumbers commented Feb 6, 2026

So we do the next most deterministic thing and resume thread 1 calling cycle hander for a() query just like with single-threaded front-end.

We can guarantee some thread waits on a() because it the first repeating query in the left-most query stack, meaning that if those weren't waits they would have to be an execution. Query system makes sure there could be only one execution, so they have to be the same exact query call and have same query parents by key (wait or execution). Thus there is a contradiction as that query parent is repeated in the left-most query stack before the first repeating query.

@zetanumbers
Copy link
Contributor Author

zetanumbers commented Feb 6, 2026

We can do that because any query call in a cycle outputs an ill-defined value from a compiler's standpoint and isn't being dependent on until compiler abort/exits. Except for representability, layout_of and type_of_opaque for which I assume output is consistent. At the time of writing this code I think I've ran query cycles tests with multi-threading enabled using a hack, but after you mentioned it I have to add -Zthreads=8 (and better run those repeatedly) profile to those tests.

Representability passes through Representability::Infinite, so that checks out.

Type_of_opaque seems to boil down to call from either type_of or constituent_types_for_auto_trait where first reduces to TyKind::Error and second is not clear to me yet and could require fixing the trait system.

@zetanumbers
Copy link
Contributor Author

zetanumbers commented Feb 6, 2026

Query layout_of seems to propagate cycle error too.

Although new cycles could be introduced by multi-threaded front-end too. If those cycles are disjoint then I assume those are independent and are valid cycles to report. Otherwise as in your example the right thread waiting on b() may only raise fatal error and never ICE as we picked the right first query to call cycle handler for, registering an error for dcx.abort_if_errors().

@zetanumbers
Copy link
Contributor Author

zetanumbers commented Feb 6, 2026

This PR wouldn't solve #142063's family of bugs related to diagnostic inconsistencies, but in this instance it is solved due to layout_of cycle handler reporting recursive async fn errors immediately.

I should add all this explanation to new cycle breaking code and query cycle handler code too.

@zetanumbers
Copy link
Contributor Author

zetanumbers commented Feb 6, 2026

Either way I don't see what could this code possibly do worse for picking and resuming queries. Although I might not know if the old query cycle breaking code does anything special aside from... breaking those cycles with no preference.

@rustbot

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rustbot
Copy link
Collaborator

rustbot commented Feb 6, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@zetanumbers
Copy link
Contributor Author

I've sliced these commits as granulary as I have ever done till this day.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ICE: parallel: internal error: entered unreachable code diagnostics flipflopping with Zthreads ICE: only 'variances_of' returns '&[ty::Variance]'