-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't copy in From<String> for InternedString
.
#14674
Don't copy in From<String> for InternedString
.
#14674
Conversation
If we have an owned `String` already, then don't clone it just to internalise it.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @epage (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
let mut cache = interned_storage(); | ||
let s = cache.get(item.as_str()).copied().unwrap_or_else(|| { | ||
let s = item.leak(); | ||
cache.insert(s); | ||
s | ||
}); | ||
|
||
InternedString { inner: s } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without performance data, this falls to being a refactor. My main concern is that we're duplicating the logic for creating an InternedString.
Unsure whether thats sufficient to block this or not. Will think more on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way to handle this is having a Cow
variant of new()
, and then both new()
and from()
call it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it just looked like an attempted optimisation — the impl for an already-owned String
— that wasn't quite complete. I couldn't work out a way to avoid the duplication without allocating on a common path.
Cow
would work but turns a statically known property into a runtime check, which might be reasonable, but requires benchmarking to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uncertainty about the additional complexity is why this was not done originally.
If we want to make it part of the type system without duplicating the core logic we could do something like:
fn new_inner<S: AsRef<str>>(str: S, make_stat: impl Fn(S) -> &'static str) -> InternedString {
let mut cache = interned_storage();
let s = cache.get(str.as_ref()).copied().unwrap_or_else(|| {
let s = make_stat(str);
cache.insert(s);
s
});
InternedString { inner: s }
}
I don't know if it's worth the additional complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo cow offers us a fairly clean design and I don't expect the check will impact performance all that much
### What does this PR try to resolve? Use `Cow` instead of duplicating the interning logic. Closes #14674 ### How should we test and review this PR? ### Additional information
If we have an owned
String
already, then don't clone it just to internalise it.What does this PR try to resolve?
Before,
InternedString::from(String::new("blahblah"))
would take ownership of theString
but then make and leak a copy (str.to_string().leak()
) for use in theinterned_storage()
. This change just avoids that copy, which I think was the intention behind the interface and the interning mechanism.How should we test and review this PR?
Several existing tests hit this implementation, so I'd expect
cargo test
to suffice for functional correctness. The implementation ofInternedString::new()
(a few lines down) is useful as a reference.Additional information
Whilst it looks like an optimisation, I've no particular performance objective here. This was just something I noticed whilst looking around the code.