Only use the new DepNode hashmap for anonymous nodes. #109050
r? @oli-obk (rustbot has picked a reviewer for you, use r? to override) |
Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 |
@bors try @rust-timer queue |
⌛ Trying commit 3a4d8fe9baa63dd9e4ffb6fa799b846fd806c00e with merge b6efc766824a4dfccb5f423fd5d3ed04e6329538... |
☀️ Try build successful - checks-actions |
This does remove an important assertion which checks for duplicate key hashes. I think it makes sense to keep the map and assertions for I'm not sure why this PR marks all the issues as fixed, as it just shifts the problem to the next session. |
Having duplicates among dep-nodes is not really an issue for the current compilation session. The only thing that is unsupported is deserializing a graph that contains them. I can definitely add a diagnostic to report which node was duplicated. I considered keeping a map or set of existing nodes. That created a lot of complexity (mostly cfgs), for an uncertain gain.
The "ICE", which is the problem from the point of view of the user, is removed. Instead, rustc gracefully recovers, which is IMO a better behaviour. I can also refine the recovery to exclude duplicated nodes, but keep the rest of the graph here. This gives me another idea: filter anonymous nodes out of the deserialized index, as those cannot have an equivalent in the current session. |
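For illustration, the "exclude duplicated nodes, but keep the rest of the graph" recovery could look roughly like the sketch below. It is not the PR's code: `NodeKey` and `build_index` are hypothetical names, and a plain `HashMap` stands in for whatever index the deserializer actually builds.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical stand-in for a dep-node key: (kind, fingerprint hash).
type NodeKey = (u16, u128);

/// Builds the key -> serialized-index lookup for a deserialized graph.
/// Keys that occur more than once are dropped from the lookup and reported,
/// instead of aborting the whole load.
fn build_index(nodes: &[NodeKey]) -> (HashMap<NodeKey, usize>, Vec<NodeKey>) {
    // `None` marks a key that has been seen at least twice.
    let mut seen: HashMap<NodeKey, Option<usize>> = HashMap::new();
    let mut duplicated = Vec::new();
    for (i, key) in nodes.iter().enumerate() {
        match seen.entry(*key) {
            Entry::Vacant(slot) => {
                slot.insert(Some(i));
            }
            Entry::Occupied(mut slot) => {
                // Report each duplicated key once, on its second occurrence.
                if slot.get().is_some() {
                    duplicated.push(*key);
                }
                slot.insert(None);
            }
        }
    }
    // Keep only the keys that were unique in the serialized graph.
    let index = seen
        .into_iter()
        .filter_map(|(key, slot)| slot.map(|i| (key, i)))
        .collect();
    (index, duplicated)
}
```

The returned `duplicated` list is what a diagnostic could print; the surviving entries remain usable for lookups in the next session.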
Finished benchmarking commit (b6efc766824a4dfccb5f423fd5d3ed04e6329538): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment. |
cc @michaelwoerister as you reviewed quite a few PRs on the query system. |
Thanks for the PR, @cjgillot! I'll take a closer look within the next two days. |
Yes, it should be correct. I'm just concerned about the hash collisions being harder to reproduce and easier to ignore, as the errors are "recovered". Having a command line option to make these hard errors would be useful for testing on CI and rustc-perf. I'm not sure if we have larger incremental crate tests on CI at all? I'm sure we have some UI tests which this PR would invalidate. Maybe we'd need to change the test runner to do another pass to verify that the dep graph also loads.
Sure, but we should still track and fix these issues. |
We could also keep ICEing on nightly and beta, and just doing the convenient thing for users on stable. Either way, this needs a summary of the behaviour changes and a compiler team FCP |
Before going to the FCP part, I went further on the idea of recovering from duplicates. That's what anonymous nodes do, so we can go even further and implement anonymous nodes by filtering them out of the TBH, I'm not sure that we should merge that last commit, so my "actual" PR goes up to the "Always recover from duplicate" commit 76a6d17662e373f779b4cefd9c6e2c8f858f583d. |
I'm not sure if silently de-duplicating actually ever happened (except for anonymous nodes). Conceptually it should always be an error if two query invocations map to the same dep-node, right? This is what this assertion is there for. If I remember correctly, this assertion has caught multiple invalid I'm definitely in favor of reducing the dep-graph's memory footprint, however 🙂 Is there a way to keep the sanity checks but do them in a less expensive way? I think the check during deserialization is pretty clever, but it does make it harder to find out where the problem is. |
I don't think I follow. As I understand it, there should never be duplicate dep-nodes to begin with. The only exception is anonymous nodes, where we introduce them explicitly, but otherwise we expect to have a 1:1 correspondence between DepNodes and query keys, right? |
I'm not sure how we can have both. Checking duplicates needs to have everything in memory at the same time somehow.
There are 2 ways to see this. The current understanding is that we have a graph where the Another understanding is that we have 2 graphs where the |
Maybe it would make sense to put the check behind a -Z flag and only populate the map when the check is enabled? Then if we get a report that something is wrong (because of the cheaper check during deserialization) we can find the root cause more easily. |
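A minimal sketch of that idea, assuming a hypothetical boolean debugging option: the key map is only allocated and populated when the check is on, so the default configuration pays no memory for it. `NodeAllocator` and `NodeKey` are illustrative names, not rustc types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dep-node key: (kind, fingerprint hash).
type NodeKey = (u16, u128);

/// Hands out node indices; remembers keys only when the check is enabled.
struct NodeAllocator {
    next_index: usize,
    /// `Some` only when the (hypothetical) debugging option is enabled.
    seen: Option<HashMap<NodeKey, usize>>,
}

impl NodeAllocator {
    fn new(check_duplicates: bool) -> Self {
        NodeAllocator {
            next_index: 0,
            seen: check_duplicates.then(HashMap::new),
        }
    }

    /// Returns `Ok(fresh_index)`. With the check enabled, a repeated key
    /// yields `Err(index_of_the_earlier_node)` instead.
    fn alloc(&mut self, key: NodeKey) -> Result<usize, usize> {
        let index = self.next_index;
        if let Some(seen) = &mut self.seen {
            if let Some(&previous) = seen.get(&key) {
                return Err(previous);
            }
            seen.insert(key, index);
        }
        self.next_index += 1;
        Ok(index)
    }
}
```

With the check disabled, a duplicate goes unnoticed at creation time and would only surface through the cheaper check during deserialization, which is the trade-off described above.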
☔ The latest upstream changes (presumably #108524) made this pull request unmergeable. Please resolve the merge conflicts. |
…=Nilstrieb Use an IndexVec to debug fingerprints. Uncontroversial part of rust-lang#109050
I think I'd prefer the behavior of having an ICE when we detect duplicates, but also erasing the incremental cache, allowing the next compilation session to succeed. That would be harder to miss than a warning for users. |
That sounds reasonable to me. Although I'm wondering if that would get users stuck in an "ICE loop", where compilation sessions alternate between ICEing and succeeding as long as the problematic piece of code exists (because the conditions for DepNode identifier collisions are inherent to the program being compiled). But maybe the ICE message could contain an explanation of the problem and instructions on how to turn off incremental compilation? |
It does not. It only maps from DepNode to SerializedDepNodeIndex once on entry. The graph walk only uses SerializedDepNodeIndex. It only fetches the DepNode for debugging and to test for eval always / forcable query. It does not use those DepNode to map back to an index, or to a color, this is done directly using the SerializedDepNodeIndex.
My proposal is to stop ICEing in both cases. I don't understand how that's worse. |
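To make the point about the graph walk concrete, here is a deliberately simplified sketch in which the key-to-index map is consulted once on entry and everything afterwards works on indices. All names are hypothetical, the graph is assumed acyclic, and the coloring rule (a node is red if it or any dependency changed) is a simplification of the real red-green algorithm.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dep-node key: (kind, fingerprint hash).
type NodeKey = (u16, u128);

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Color {
    Green,
    Red,
}

struct PreviousGraph {
    /// Key -> index lookup, consulted only on entry.
    index: HashMap<NodeKey, usize>,
    /// Dependencies of each node, stored as indices.
    edges: Vec<Vec<usize>>,
    /// Whether each node's own result changed (an input invented for this sketch).
    changed: Vec<bool>,
}

impl PreviousGraph {
    /// Entry point: the only place where a key is turned into an index.
    fn color_by_key(&self, key: NodeKey, colors: &mut [Option<Color>]) -> Option<Color> {
        let start = *self.index.get(&key)?;
        Some(self.color_of(start, colors))
    }

    /// The walk itself: indices in, colors out, no key lookups.
    fn color_of(&self, node: usize, colors: &mut [Option<Color>]) -> Color {
        if let Some(color) = colors[node] {
            return color;
        }
        let mut color = if self.changed[node] { Color::Red } else { Color::Green };
        for &dep in &self.edges[node] {
            if self.color_of(dep, colors) == Color::Red {
                color = Color::Red;
            }
        }
        colors[node] = Some(color);
        color
    }
}
```

In this shape, a duplicated key can only affect which index the entry lookup returns; the walk and the color table never go back through the map.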
Isn't this a place where things can go wrong? What do you think about focusing this PR on getting rid of the big |
For
Split in #112469 |
Thanks for splitting out #112469! I think the other changes in this PR could basically be seen as a proposal to change how DepNodes are defined. Right now a (non-anonymous) DepNode represents a single query invocation, that is, there is a 1:1 mapping between DepNodes and query keys. Under that paradigm, the fact that DepNodes internally contain a fingerprint is just an implementation detail. We require these fingerprints to be effectively unique -- or we would have to replace them with something else (like a serialized version of the corresponding query key; which is how they actually were implemented in early versions of the system). The contested changes in this PR, however, in a way amount to making DepNodes a "best-effort" concept: except for DepNode kinds where we want to reconstruct the query key, the system only assumes that DepNodes are a hint for finding the right SerializedDepNodeIndex when invoking I think this is really interesting! It would potentially enable a number of things:
So, overall, I think this is really interesting and promising. I just don't think it would be good form to implement a rather fundamental shift like this "under the radar" as a drive-by fix and without updating our documentation, right? |
☔ The latest upstream changes (presumably #110050) made this pull request unmergeable. Please resolve the merge conflicts. |
…ingerprints, r=<try>
Experiment: Only track fingerprints for queries with reconstructible dep-nodes.
This is an experiment to collect performance data about alternative ways to adapt rust-lang#109050. The PR makes the following change: All queries with keys that are not reconstructible from their corresponding DepNode are now treated similar to anonymous queries. That is we don't compute a DepNode or result fingerprint for them.
This has some implications:
- We save time because query keys and results don't have to be hashed.
- We can save space storing less data for these nodes in the on-disk dep-graph. (not implemented in this PR as I ran out of time. Maybe this would be a quick fix for `@saethlin` though?)
- We don't have to worry about hash collisions for DepNode in these cases (although we still have to worry about hash collisions for result fingerprints, which might include all the same HashStable impls)
- Same as with anonymous queries, the graph can grow additional nodes and edges in some situations because existing graph parts might be promoted while new parts are allocated for the same query if it is re-executed. I don't know how much this happens in practice.
- We cannot cache query results for queries with complex keys.
Given that that last point affects some heavy queries, I have my doubts that this strategy is a win. But let's run it through perf at least once.
cc `@cjgillot`, `@Zoxc`
r? `@ghost`
3eef8ed to 381c7a3
Always recover from duplicate DepNode.
As the hash function for the input is not complete, those queries can trigger a re-use of a DepNode. As we now tolerate having duplicated DepNode, we effectively remove this ICE.
381c7a3 to 99ecfbc
Hi @cjgillot! What's the status of this? I see it's still tagged as needing an FCP; is that still the current status? Thanks if you can post an update (or let us know if you still have the interest/bandwidth to pursue this goal) :-) |
Removing a review assignee for now, waiting to clarify the status |
☔ The latest upstream changes (presumably #132282) made this pull request unmergeable. Please resolve the merge conflicts. |
Profiling the compilation of the windows crate with massif showed that a hashmap `DepNode -> DepNodeIndex` accounted for 6 GB of the peak memory usage for that crate. This PR aims to remove this map.
There are 3 hashmaps of `DepNode` in the query system, among them the `fingerprints` map (hashmap 2), which is used for debugging, and the `new_node_to_index` map (hashmap 3), which is used to deduplicate nodes and check for their existence.
Hashmap 1 is unavoidable, as this is the only way to make a correspondence between two compilation sessions.
The second commit replaces hashmap 2 by a simple `IndexVec<DepNodeIndex, Option<Fingerprint>>`.
Hashmap 3 is the more interesting. Having duplicate DepNodes in the serialized graph is not supported. So the current solution was to either ICE when creating duplicates, or deduplicate them silently using the `new_node_to_index` map.
The third commit moves the burden of checking for duplicates to dep-graph deserialization, which fails when there are duplicates. Instead of an ICE, we silently clear the incremental session, and continue compilation.
The remaining source of duplicates is the creation of anonymous nodes, for which the `DepNode` is just the hash of the dependencies' indices. The `new_node_to_index` map is shrunk to only be used for such anonymous nodes.
These changes reduce peak memory usage on that crate from 21 GB to 16 GB.
Drive-by: the first commit fixes #101518 by marking the affected query as anonymous. As we remove the check for duplicated DepNodes, the ICE would have been replaced by a silent clearing of the incremental state, which would have been unfortunate.
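As a rough illustration of replacing a debugging hashmap with an index-addressed vector, the sketch below uses a plain `Vec<Option<u128>>` in place of `IndexVec<DepNodeIndex, Option<Fingerprint>>`. The type alias, the `DebugFingerprints` struct and its `record` method are assumptions made for the example, not the code in this PR.

```rust
/// Hypothetical stand-in for rustc's `Fingerprint`.
type Fingerprint = u128;

/// Debug-only table of fingerprints, addressed directly by node index.
#[derive(Default)]
struct DebugFingerprints {
    slots: Vec<Option<Fingerprint>>,
}

impl DebugFingerprints {
    /// Records `fingerprint` for `index`, growing the table as needed.
    /// Returns `false` if a different fingerprint was already recorded
    /// for that index, which is the inconsistency a debug check would flag.
    fn record(&mut self, index: usize, fingerprint: Fingerprint) -> bool {
        if index >= self.slots.len() {
            self.slots.resize(index + 1, None);
        }
        match self.slots[index] {
            Some(previous) => previous == fingerprint,
            None => {
                self.slots[index] = Some(fingerprint);
                true
            }
        }
    }
}
```

Because node indices are dense, the vector stores one slot per node with no keys and no hashing, which is where the saving over a `DepNode`-keyed map would come from.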
https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/improving.20rustc.20memory.20usage
Fixes #83085
Fixes #101518
Fixes #106136
Fixes #107991
Fixes #108657