String interning #385
Comments
Let’s call Ideally to reduce copying we’d convert strings to The AST however is backed by Rowan, which has its own opinion about string storage. I’m not sure how to reconcile both.
Deduplicating identical strings is nice to save memory and to make A smart string type with reference counting (fast clone) and pre-computed hash (fast
The WIP We decided against pre-computed hashing, since hashing a pre-computed 8-byte hash is not much less work than hashing names that are rarely dozens of bytes. Deduplication is also not done. An explicit per-document interner would not work since a
#868 changes the memory representation of Reopening to track ideas in case we want to reconsider this later.
I just came across this post again, about the cost of many threads cloning the same If the Router ever has a tight loop that clones a
My impression is that apollo-federation today clones
The HIR in `apollo-compiler` uses `String` a lot. Cloning, hashing, and comparing those strings repeatedly has a cost that can add up. This can be solved by having some kind of string "interning" that deduplicates strings in memory, so that for example equality would be as cheap as a pointer comparison.

Some secondary goals are:

`AsRef<str>`, `Eq`, `Hash`, etc. That is, using them does not require separate access to a string pool / interner.

With a quick search I found a number of interning crates that don’t do all of this. Servo’s string-cache does but comes with non-trivial complexity for features that are maybe not as useful here (namely: multiple static sets of "well-known" strings that can be interned without allocation and in `const` context, and small string optimization).

Perhaps something custom to apollo-rs would make sense. I think the design might involve atomic reference-counting, precomputed hash, and a process-global mutex-protected hash table. It could use https://crates.io/crates/weak-table, or something even simpler since we don’t need a weak counter (as it would always be 1). But maybe a full custom hash table is too much wheel-reinvention.
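
To make the idea more concrete, here is a minimal sketch of what such a type could look like, using only the standard library. The names `Interned`, `POOL`, and `pool` are hypothetical and not part of apollo-rs. It simplifies the design described above in two ways: the pool keeps a strong reference to every string, so unused entries are never evicted (a weak-reference table such as weak-table would be needed for that), and instead of a precomputed hash it hashes the pointer address, which is valid only because equal strings share one allocation.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical interned string: a self-contained, cheap-to-clone handle
/// to a deduplicated `str`. Cloning only bumps an atomic reference count.
#[derive(Clone, Debug)]
pub struct Interned(Arc<str>);

/// Process-global, mutex-protected pool (assumption: entries are never removed).
static POOL: OnceLock<Mutex<HashSet<Arc<str>>>> = OnceLock::new();

fn pool() -> &'static Mutex<HashSet<Arc<str>>> {
    POOL.get_or_init(|| Mutex::new(HashSet::new()))
}

impl Interned {
    /// Returns the pooled handle for `s`, allocating only on first sight.
    pub fn new(s: &str) -> Self {
        let mut pool = pool().lock().unwrap();
        if let Some(existing) = pool.get(s) {
            return Interned(Arc::clone(existing));
        }
        let arc: Arc<str> = Arc::from(s);
        pool.insert(Arc::clone(&arc));
        Interned(arc)
    }
}

impl PartialEq for Interned {
    /// Equal contents always share one allocation, so a pointer
    /// comparison is enough.
    fn eq(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }
}

impl Eq for Interned {}

impl Hash for Interned {
    /// Hashes the address rather than the bytes; consistent with `eq` above,
    /// but not with `str`'s own `Hash`, so `Borrow<str>` is deliberately
    /// not implemented.
    fn hash<H: Hasher>(&self, state: &mut H) {
        (Arc::as_ptr(&self.0) as *const u8 as usize).hash(state);
    }
}

impl AsRef<str> for Interned {
    fn as_ref(&self) -> &str {
        &self.0
    }
}
```

With this sketch, `Interned::new("Query") == Interned::new("Query")` holds via a single pointer comparison, and the handles can be used as `HashMap` keys without going through the pool. The global mutex is taken only when creating a handle from a `&str`, not on clone, comparison, or hashing.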