Replace String with SmolStr for nodes Identifier and ExprName #9202
Thank you.
The code changes look good to me. The two main questions that I have are:
- How did you decide on smol_str over any of the other small string crates? For example, compact_str supports up to 24 bytes inline (12 bytes on 32-bit architectures). But I'm unfamiliar with all the crates, so there might be other considerations. The intention isn't that you change the small string crate, but that we document how we made the decision in case we want to revisit it in the future.
- How did you decide where to use SmolStr? E.g. you use it for Names and Identifiers but not for String literals.
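For readers unfamiliar with the trade-off behind these questions, here is a minimal sketch of the small string idea using only the standard library. `TinyStr` and `INLINE_CAP` are hypothetical names invented for this illustration, and the layout is not that of smol_str, compact_str, or any other real crate; the point is only that short text such as a typical identifier stays inline while longer text falls back to a heap allocation.

```rust
/// Maximum number of bytes kept inline. An arbitrary value chosen for this
/// sketch, not the capacity of any real crate.
const INLINE_CAP: usize = 22;

/// Hypothetical small string type, for illustration only.
#[derive(Clone, Debug)]
pub enum TinyStr {
    /// Short text is copied into a fixed buffer; no heap allocation.
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    /// Longer text falls back to a heap allocation.
    Heap(Box<str>),
}

impl TinyStr {
    pub fn new(text: &str) -> Self {
        if text.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..text.len()].copy_from_slice(text.as_bytes());
            // The cast is lossless because `text.len() <= INLINE_CAP`.
            TinyStr::Inline { len: text.len() as u8, buf }
        } else {
            TinyStr::Heap(text.into())
        }
    }

    pub fn as_str(&self) -> &str {
        match self {
            TinyStr::Inline { len, buf } => {
                // The bytes were copied from a valid `&str`, so this cannot fail.
                std::str::from_utf8(&buf[..*len as usize])
                    .expect("inline bytes are valid UTF-8")
            }
            TinyStr::Heap(s) => s.as_ref(),
        }
    }

    /// Reports whether the text is stored without a heap allocation.
    pub fn is_inline(&self) -> bool {
        matches!(self, TinyStr::Inline { .. })
    }
}
```

Under this sketch, names and identifiers would usually take the inline path, whereas string literals of arbitrary length would often take the heap path and gain little.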
@@ -131,7 +131,7 @@ fn get_undecorated_methods(

if let Expr::Name(ast::ExprName { id, .. }) = &arguments.args[0] {
    if target_name == *id {
        explicit_decorator_calls.insert(id.clone(), stmt.range());
GitHub doesn't allow me to make a suggestion on the entire code. We can avoid calling to_string here by changing explicit_decorator_calls to HashMap<&str, TextRange>.
The id.to_string call on line 128 also seems unnecessary (we can use the SmolStr directly or use as_str).
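A rough sketch of the borrowed-key idea, assuming the map does not need to outlive the AST it reads from. The `TextRange` struct and the `collect_calls` function below are hypothetical stand-ins written for this example, and plain `String` is used in place of `SmolStr` so the snippet depends only on the standard library.

```rust
use std::collections::HashMap;

/// Stand-in for the real `TextRange` type; hypothetical, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

/// Records the range of every entry whose name equals `target_name`.
/// The keys borrow from `names`, so nothing is cloned or allocated per key.
pub fn collect_calls<'a>(
    names: &'a [(String, TextRange)],
    target_name: &str,
) -> HashMap<&'a str, TextRange> {
    let mut explicit_decorator_calls = HashMap::new();
    for (id, range) in names {
        if id.as_str() == target_name {
            // `as_str` yields a `&'a str` tied to the lifetime of `names`.
            explicit_decorator_calls.insert(id.as_str(), *range);
        }
    }
    explicit_decorator_calls
}
```

The cost of this design is a lifetime parameter on the map: it can only be used while the borrowed names are still alive.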
@@ -1739,7 +1740,7 @@ impl From<ExprStarred> for Expr {
 #[derive(Clone, Debug, PartialEq)]
 pub struct ExprName {
     pub range: TextRange,
-    pub id: String,
+    pub id: SmolStr,
How did you decide where to use SmolStr? Should ExprIpyEscapeCommand, StringLiteral, and FStringElement use SmolStr too, or is it intentional that they do not because they're less likely to be short?
I already answered most of this question in another comment, but I didn't mention ExprIpyEscapeCommand. I haven't seen many real-world uses of ExprIpyEscapeCommand, so I don't know whether they are usually short, like Names and Identifiers. I can't really tell if they are a suitable candidate for SmolStr.
I haven't investigated much about the different string crates, I found out about
Since
Thanks for separating this out!
Do you mind either merging
I think you already have edit privileges on this branch.
Ah right, thank you, it was an HTTPS vs. SSH thing.
f06094b to ee373e3
It looks like there is almost no difference in the micro-benchmarks? Is this change still worth doing?
-Tok::Name {
-    name: name.to_owned(),
-}
+Tok::Name { name: name.into() }
-Tok::Name { name: name.into() }
+Tok::Name { name: CompactString::new_inline(name) }
That's the observation I made as well when I made this change a while back in #6174. I wouldn't expect any changes in benchmarks other than the lexer's, because our (old) parser allocates way too much, so any other improvements are lost in the noise. I'm surprised that the lexer benchmarks are regressing. But the microbenchmarks might just be too short and exercise the exact same memory pattern on each run, allowing the allocator to re-use the allocations cheaply. Or the change is indeed not as impactful as we thought.
Using
Copying this here from the Discord conversation with @LaBatata101
@LaBatata101 do you want to analyze the memory consumption, or should we move forward without the small string optimisation for now (and close both of our PRs)?
I think it's better to close both of our PRs for now, and do an in-depth analysis after the new parser is merged.
Summary
Changes the type of the id field in the Identifier and ExprName nodes from String to SmolStr. This is a required change for the new parser.
This PR is part of #9152
Test Plan
cargo test --lib