Explain all the stable hashing shenanigans #203

Open
RalfJung opened this issue Sep 20, 2018 · 10 comments
Labels
A-incr-comp Area: incremental compilation C-enhancement Category: enhancement E-help-wanted Call for participation: extra help is wanted E-medium Difficulty: might require some prior knowledge or code reading E-needs-writeup Call for participation: discussion can be written up without much research required T-compiler Relevant to compiler team

Comments

@RalfJung
Member

RalfJung commented Sep 20, 2018

So rustc is full of these impl_stable_hash_for. What are these for? I originally thought that would be for FxHashMap, but that seems to be wrong (I still need to derive(Hash) to use FxHashMap). So now I am just confused. It would be great if the guide could explain that.

@eddyb said "incremental" but that on its own does not explain much of anything -- why is Hash not good enough? Why do I need a tcx to compute a "stable hash"?

@RalfJung
Member Author

I was told that @michaelwoerister knows all about this? :D

@eddyb
Member

eddyb commented Sep 20, 2018

Our Hash impls hash pointers, IDs, etc. - they're designed for efficiency, not stability.

Stability here is across compilations, and it means the hash depends on semantic data, not transient representation.
tcx is needed to e.g. convert an ID into its "stable" representation / get a cached hash.

@RalfJung
Member Author

What kind of "ID" are you referring to?

@RalfJung
Member Author

Okay so "stable" here means "guaranteed not to change between rustc invocations". We must not hash pointers, for example. Good to know.

@michaelwoerister
Member

"Stable" here means stable across compilation sessions and crate boundaries. For example, if you Hash a Ty you get a different value in two different compiler processes (because you are actually hashing a pointer to an interned data structure). If you StableHash it, the hash value will be the same for different invocations of the compiler, and it will also be the same, independently of whether the type was defined in the current crate being compiled or if it was loaded from an upstream crate.

This is used for telling whether something has changed between two sessions (for incr. comp.) without actually having to store the value somewhere. Another example is the hash value at the end of every Rust symbol. This also needs to be stable across sessions and crate boundaries.
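
To make the distinction above concrete, here is a minimal sketch in plain Rust. It is not rustc code: `fnv1a`, `address_key`, `stable_fingerprint`, and `changed_since_last_session` are hypothetical names invented for illustration, and FNV-1a stands in for whatever hash function the compiler actually uses. The point is only that an address-based key differs between allocations (and between processes), while a content-based fingerprint does not, so keeping the old fingerprint is enough to detect a change.

```rust
/// 64-bit FNV-1a over raw bytes. Chosen only because it is tiny and fully
/// deterministic; this is an assumption, not the hash function rustc uses.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// What a pointer-based `Hash` impl effectively feeds to the hasher:
/// the address of the allocation, which can vary from run to run.
fn address_key(interned: &str) -> usize {
    interned.as_ptr() as usize
}

/// What a stable hash feeds to the hasher: the contents only.
fn stable_fingerprint(interned: &str) -> u64 {
    fnv1a(interned.as_bytes())
}

/// Change detection: only the previous session's fingerprint is kept,
/// not the previous value itself.
fn changed_since_last_session(previous: u64, current: &str) -> bool {
    previous != stable_fingerprint(current)
}
```

Two separately allocated copies of the same text get different `address_key` values but the same `stable_fingerprint`, which is the property the comment above describes for `Ty`.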

@RalfJung
Member Author

@michaelwoerister thanks, that helps! Why does this kind of stability require access to a "context" (StableHashingContext) though?

@eddyb
Member

eddyb commented Sep 20, 2018

@RalfJung To cache some kinds of more expensive hashes and to look up IDs (NodeId, DefId, etc.): you can't hash the numerical value of the ID, you have to hash the "definition" that it refers to.

@michaelwoerister
Member

It's not just caching. NodeId, DefId, Span, etc. are not stable things. The context provides the data needed for mapping them into a stable format, for example mapping a Span from a u32 to file:line:col.
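
A rough sketch of why a context is needed, again in plain Rust rather than rustc's real types. `Ctx`, its fields, and its methods are hypothetical, and the tables it holds are simplified stand-ins for whatever the real context looks up. The numeric ID and the byte offset mean nothing on their own; the context supplies the tables that turn them into a definition path and a file:line:col position, and only those get hashed or compared.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a, used here only as a small deterministic hash
/// (an assumption for illustration, not what rustc uses).
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical stand-in for a hashing context: the per-session tables
/// needed to turn unstable IDs into stable data.
struct Ctx {
    /// Numeric definition ID -> path of the definition it refers to.
    def_paths: HashMap<u32, String>,
    /// Name of the (single) source file in this toy example.
    file_name: String,
    /// Sorted byte offsets at which each line starts; the first is 0.
    line_starts: Vec<u32>,
}

impl Ctx {
    /// Hash the path behind a numeric ID instead of the number itself.
    /// Returns `None` if the ID is unknown to this context.
    fn stable_hash_def(&self, id: u32) -> Option<u64> {
        self.def_paths.get(&id).map(|path| fnv1a(path.as_bytes()))
    }

    /// Map a byte offset to "file:line:col" (line and column are 1-based).
    /// Returns `None` if no line start precedes the offset.
    fn stable_span(&self, offset: u32) -> Option<String> {
        // Number of line starts <= offset, which is the 1-based line number.
        let line = self.line_starts.partition_point(|&start| start <= offset);
        let line_start = *self.line_starts.get(line.checked_sub(1)?)?;
        let col = offset - line_start + 1;
        Some(format!("{}:{}:{}", self.file_name, line, col))
    }
}
```

If one session assigns a definition the ID 7 and the next session assigns it 12, `stable_hash_def` still returns the same value in both, as long as each session's `def_paths` maps its ID to the same path.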

@eddyb
Member

eddyb commented Sep 21, 2018

That's what I meant by "looking up IDs".

@michaelwoerister
Member

Right, I wasn't reading your answer properly :)

@mark-i-m mark-i-m added E-help-wanted Call for participation: extra help is wanted E-medium Difficulty: might require some prior knowledge or code reading E-hard Difficulty: might require advanced knowledge labels Sep 21, 2018
@mark-i-m mark-i-m added E-easy Difficulty: might be a good place for a beginner E-needs-writeup Call for participation: discussion can be written up without much research required and removed E-hard Difficulty: might require advanced knowledge E-medium Difficulty: might require some prior knowledge or code reading labels May 6, 2020
@jieyouxu jieyouxu added E-medium Difficulty: might require some prior knowledge or code reading C-enhancement Category: enhancement T-compiler Relevant to compiler team A-incr-comp Area: incremental compilation and removed E-easy Difficulty: might be a good place for a beginner labels Nov 4, 2024