-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove HashStable impl for collection types with unstable iteration order #533
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed. cc @rust-lang/compiler @rust-lang/compiler-contributors |
@rustbot second (accidentally reinvented parts of this design the other day) |
@rustbot label -final-comment-period +major-change-accepted |
MCP 533: rust-lang/compiler-team#533 Also, as an example, substitute UnordMap for FxHashMap in used_trait_imports query result.
…r=lncr Introduce UnordMap, UnordSet, and UnordBag (MCP 533) This is the start of implementing [MCP 533](rust-lang/compiler-team#533). I followed `@eddyb's` suggestion of naming the collection types `Unord(Map/Set/Bag)` which is a bit easier to type than `Unordered(Map/Set/Bag)` r? `@eddyb`
Introduce UnordMap, UnordSet, and UnordBag (MCP 533) This is the start of implementing [MCP 533](rust-lang/compiler-team#533). I followed `@eddyb's` suggestion of naming the collection types `Unord(Map/Set/Bag)` which is a bit easier to type than `Unordered(Map/Set/Bag)` r? `@eddyb`
…t, r=nagisa Add StableOrd trait as proposed in MCP 533. The `StableOrd` trait can be used to mark types as having a stable sort order across compilation sessions. Collections that sort their items in a stable way can safely implement HashStable by hashing items in sort order. See rust-lang/compiler-team#533 for more information.
Add StableOrd trait as proposed in MCP 533. The `StableOrd` trait can be used to mark types as having a stable sort order across compilation sessions. Collections that sort their items in a stable way can safely implement HashStable by hashing items in sort order. See rust-lang/compiler-team#533 for more information.
…s, r=oli-obk Use UnordMap and UnordSet for id collections (DefIdMap, LocalDefIdMap, etc) This PR changes the `rustc_data_structures::define_id_collections!` macro to use `UnordMap` and `UnordSet` instead of `FxHashMap` and `FxHashSet`. This should account for a large portion of hash-maps being used in places where they can cause trouble. The changes required are moderate but non-zero: - In some places the collections are extracted into sorted vecs. - There are a few instances where for-loops have been changed to extends. ~~Let's see what the performance impact is. With a bit more refactoring, we might be able to get rid of some of the additional sorting -- but the change set is already big enough. Unless there's a performance impact, I'd like to do further changes in subsequent PRs.~~ Performance does not seem to be negatively affected ([perf-run here](rust-lang#106977 (comment))). Part of [MCP 533](rust-lang/compiler-team#533). r? `@ghost`
Use UnordMap and UnordSet for id collections (DefIdMap, LocalDefIdMap, etc) This PR changes the `rustc_data_structures::define_id_collections!` macro to use `UnordMap` and `UnordSet` instead of `FxHashMap` and `FxHashSet`. This should account for a large portion of hash-maps being used in places where they can cause trouble. The changes required are moderate but non-zero: - In some places the collections are extracted into sorted vecs. - There are a few instances where for-loops have been changed to extends. ~~Let's see what the performance impact is. With a bit more refactoring, we might be able to get rid of some of the additional sorting -- but the change set is already big enough. Unless there's a performance impact, I'd like to do further changes in subsequent PRs.~~ Performance does not seem to be negatively affected ([perf-run here](rust-lang/rust#106977 (comment))). Part of [MCP 533](rust-lang/compiler-team#533). r? `@ghost`
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
…llot Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
…ctiveVisibilities. Part of rust-lang/compiler-team#533
…is, r=<try> Use FxIndexMap instead FxHashMap to stabilize iteration order in EffectiveVisibilities Part of [MCP 533](rust-lang/compiler-team#533).
…is, r=cjgillot Use FxIndexMap instead FxHashMap to stabilize iteration order in EffectiveVisibilities Part of [MCP 533](rust-lang/compiler-team#533).
…is, r=cjgillot Use FxIndexMap instead FxHashMap to stabilize iteration order in EffectiveVisibilities Part of [MCP 533](rust-lang/compiler-team#533).
…illot Use FxIndexMap instead FxHashMap to stabilize iteration order in EffectiveVisibilities Part of [MCP 533](rust-lang/compiler-team#533).
…ctiveVisibilities. Part of rust-lang/compiler-team#533
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
…illot Use FxIndexMap instead FxHashMap to stabilize iteration order in EffectiveVisibilities Part of [MCP 533](rust-lang/compiler-team#533).
Introduce UnordMap, UnordSet, and UnordBag (MCP 533) This is the start of implementing [MCP 533](rust-lang/compiler-team#533). I followed `@eddyb's` suggestion of naming the collection types `Unord(Map/Set/Bag)` which is a bit easier to type than `Unordered(Map/Set/Bag)` r? `@eddyb`
Introduce UnordMap, UnordSet, and UnordBag (MCP 533) This is the start of implementing [MCP 533](rust-lang/compiler-team#533). I followed `@eddyb's` suggestion of naming the collection types `Unord(Map/Set/Bag)` which is a bit easier to type than `Unordered(Map/Set/Bag)` r? `@eddyb`
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`. The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`. The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually. Part of [MCP 533](rust-lang/compiler-team#533).
…illot Use FxIndexMap instead FxHashMap to stabilize iteration order in EffectiveVisibilities Part of [MCP 533](rust-lang/compiler-team#533).
… r=<try> Stabilize order of MonoItems in CGUs and disallow query_instability lint for rustc_monomorphize The HashStable impl for `CodegenUnit` was incorrect as described in [MCP 533](rust-lang/compiler-team#533). This PR removes any indeterminism from the way codegen units are built. The changes are pretty straightforward. Part of rust-lang#84447 and [MCP 533](rust-lang/compiler-team#533).
… r=oli-obk Stabilize order of MonoItems in CGUs and disallow query_instability lint for rustc_monomorphize The HashStable impl for `CodegenUnit` was incorrect as described in [MCP 533](rust-lang/compiler-team#533). This PR removes any indeterminism from the way codegen units are built. The changes are pretty straightforward. Part of rust-lang#84447 and [MCP 533](rust-lang/compiler-team#533).
Stabilize order of MonoItems in CGUs and disallow query_instability lint for rustc_monomorphize The HashStable impl for `CodegenUnit` was incorrect as described in [MCP 533](rust-lang/compiler-team#533). This PR removes any indeterminism from the way codegen units are built. The changes are pretty straightforward. Part of rust-lang/rust#84447 and [MCP 533](rust-lang/compiler-team#533).
…elwoerister Un-unsafe the `StableOrd` trait Whilst incorrect implementations of this trait can cause miscompilation, they cannot cause memory unsafety in rustc. [Discussed on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Policy.20of.20.60unsafe.60.20within.20the.20compiler). cc [MCP 533](rust-lang/compiler-team#533), rust-lang#105175, `@michaelwoerister` r? `@Nilstrieb`
…woerister Un-unsafe the `StableOrd` trait Whilst incorrect implementations of this trait can cause miscompilation, they cannot cause memory unsafety in rustc. [Discussed on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Policy.20of.20.60unsafe.60.20within.20the.20compiler). cc [MCP 533](rust-lang/compiler-team#533), rust-lang#105175, `@michaelwoerister` r? `@Nilstrieb`
Un-unsafe the `StableOrd` trait Whilst incorrect implementations of this trait can cause miscompilation, they cannot cause memory unsafety in rustc. [Discussed on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Policy.20of.20.60unsafe.60.20within.20the.20compiler). cc [MCP 533](rust-lang/compiler-team#533), #105175, `@michaelwoerister` r? `@Nilstrieb`
Stabilize order of MonoItems in CGUs and disallow query_instability lint for rustc_monomorphize The HashStable impl for `CodegenUnit` was incorrect as described in [MCP 533](rust-lang/compiler-team#533). This PR removes any indeterminism from the way codegen units are built. The changes are pretty straightforward. Part of rust-lang/rust#84447 and [MCP 533](rust-lang/compiler-team#533).
Un-unsafe the `StableOrd` trait Whilst incorrect implementations of this trait can cause miscompilation, they cannot cause memory unsafety in rustc. [Discussed on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Policy.20of.20.60unsafe.60.20within.20the.20compiler). cc [MCP 533](rust-lang/compiler-team#533), #105175, `@michaelwoerister` r? `@Nilstrieb`
Un-unsafe the `StableOrd` trait Whilst incorrect implementations of this trait can cause miscompilation, they cannot cause memory unsafety in rustc. [Discussed on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Policy.20of.20.60unsafe.60.20within.20the.20compiler). cc [MCP 533](rust-lang/compiler-team#533), #105175, `@michaelwoerister` r? `@Nilstrieb`
Proposal
This MCP has the goal of removing unsound
HashStable
impls that currently exist for various widely used collection types, in particular collections derived fromstd::collections::HashMap
andstd::collections::HashSet
(e.g.FxHashMap
,DefIdSet
, etc), and to restrict theHashStable
impls forOrd
-based collections (likeBTreeMap
andSortedMap
) to instances with keys that have a stable sort order across compilation session boundaries.HashMap / HashSet
The iteration order of
HashMap
andHashSet
depends on the hash value of keys. This hash value, however, very often is not stable across compilation session boundaries because it can be computed from memory addresses (e.g.Interned<T>
) or unstable IDs (likeDefIndex
orCrateNum
). The currentHashStable
implementation "solves" this problem by computing an iteration-order-independent fingerprint for these collections -- but this is not sound since the iteration order can influence the outcome of query computation and thus the fingerprint must take it into account. Worst case, this fingerprint inaccuracy can lead to silent miscompilations.In addition to the problem of hash values, the iteration order of these collections is not defined in any way, i.e. the documentation gives no guarantees. In particular, it is not guaranteed that
hash_set.iter()
has the same order asdecode(encode(hash_set)).iter()
, so even for keys with stable hash values (e.g.String
), we cannot rely on iteration order staying the same across compilation session boundaries.What to use instead of HashMap / HashSet
Various forms of
HashMap
andHashSet
are widely used in the compiler. Removing theirHashStable
implementation means that they cannot be used in query keys and query results anymore. Luckily, there are a couple of good alternatives:We already use
FxIndexMap
andFxIndexSet
in many places. These are high-performance hashing-based collections with the additional guarantee of having a stable iteration order. TheirHashStable
implementation is valid and they can be used as a drop-in replacement ofHashMap
andHashSet
in most circumstances.The MCP proposes to add two new collection types
UnorderedMap
andUnorderedSet
torustc_data_structures
that provide the same interface as other map and set collection types, but that do a better job of hiding iteration order. These types can be a good choice in cases where a collection does not have any ordering at the conceptual level. One advantage they have, is that they "erase" the iteration order and thus can safely be built from non-deterministic sources -- such as a regularFxHashMap
or via multi-threaded code.Interaction with
potential_query_instability
lintMCP 465 introduced the
potential_query_instability
lint, which tries to solve much of the same problem as the changes proposed here. However, as discussed on Zulip, neither the lint, nor the changes here will catch all problematic cases. E.g. withUnorderedMap
we can allow safe instances of collection iteration (likemap.iter().any(predicate)
) in a single place at the API-level, where with the lint each usage instance would have to be audited and#[allow()]
ed individually. At the same time, the lint can catch cases where we cannot enforce aHashStable
bound, like when using anFxHashMap
locally inside a query implementation.So this MCP does not propose to change anything about the lint (other than applying it more widely
:)
)Ord
-based collectionsIn a way,
Ord
-based collections likeBTreeMap
andSortedMap
are in a better position thanHashMap
because they have a well-defined iteration order. We just have to make sure that the keys being sorted by do not introduce instability. To this end, the MCP proposes to add aStableOrd
trait:With this trait available, we can implement
HashStable
only for collection instances where the key has such a stable sort order.The MCP proposes to implement the trait sparingly, since there is no good, generic way to writing regression tests that make sure a given type has a stable sort order (if you know of one, tell us!).
Mentors or Reviewers
Someone from @rust-lang/wg-incr-comp?
Process
The main points of the Major Change Process are as follows:
@rustbot second
.-C flag
, then full team check-off is required.@rfcbot fcp merge
on either the MCP or the PR.You can read more about Major Change Proposals on forge.
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.
The text was updated successfully, but these errors were encountered: