Is it possible to make Option<CompactStr> same size as Option<String>? #19
Comments
I think it is possible to do, but then Since it can store the string inline, the only field it can guarantee to not be |
Thanks for the issue! This is a great use case I hadn't thought of. At a bit level, we should be able to represent a distinct |
Why not bring the length byte to byte 0? If you disallow byte 0 to be ever 0, then you can use the NonZeroU8 rule to optimize the option. Then byte 0 having value of 1 means it is a heap case. Value 2 to 25 maps to length of 0 to 23. Value 26 onwards maps to UTF8 values for the longest inline character. Length byte at byte 0 is nice because getting the length byte does not need an address offset calculation. Perhaps this is a way to claw back lost performance. For the heap case, the 8 pad bytes would be the first 8 bytes, padded with value 1. |
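A minimal sketch of the byte-0 proposal above, assuming a hypothetical 24-byte struct named TagFirst (this is not CompactStr's actual layout) and hypothetical names Kind and classify. It only illustrates that a NonZeroU8 in the first byte lets current rustc fold None into the zero value, and how the suggested tag mapping (1 = heap, 2 to 25 = inline length 0 to 23, 26 and up = UTF-8 data) could be decoded. Niche placement for user-defined structs is a compiler optimization rather than a language guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Hypothetical 24-byte layout with the tag in byte 0 (illustration only).
#[allow(dead_code)]
#[repr(C)]
struct TagFirst {
    tag: NonZeroU8, // never 0, so the all-zero tag is free to encode `None`
    rest: [u8; 23],
}

/// What a tag byte means under the mapping proposed in the comment above.
#[derive(Debug, PartialEq)]
enum Kind {
    Heap,
    Inline { len: u8 },
    Utf8Data,
}

fn classify(tag: NonZeroU8) -> Kind {
    match tag.get() {
        1 => Kind::Heap,
        t @ 2..=25 => Kind::Inline { len: t - 2 }, // 2..=25 maps to 0..=23
        _ => Kind::Utf8Data,                       // 26 and above
    }
}

fn main() {
    // On current rustc the Option costs no extra bytes.
    assert_eq!(size_of::<TagFirst>(), 24);
    assert_eq!(size_of::<Option<TagFirst>>(), 24);
    println!("{:?}", classify(NonZeroU8::new(2).unwrap()));
}
```

Note that an empty string is still representable here, because length 0 is stored as tag value 2 rather than 0.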
But what if the string is empty? |
We don't necessarily need the niche value to be zero (i.e. The tricky part is getting the compiler to see the niche value, without losing any performance. Internally a Also FWIW the length byte is now the last byte because it allows the memcpys for an inline string to be better aligned, which had a significant (~35%) improvement on performance |
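To illustrate the point that the niche does not have to be zero and that the tag can stay in the last byte, here is a small sketch. LastByte and TagLast are hypothetical names, and the three variants are placeholders, not the crate's real encoding. A #[repr(u8)] enum that does not use every byte value leaves the unused values available as a niche, which current rustc uses for Option.

```rust
use std::mem::size_of;

/// Hypothetical tag stored in the last byte. Zero is a valid value here,
/// so any niche the compiler finds must be one of the unused values 3..=255.
#[allow(dead_code)]
#[repr(u8)]
enum LastByte {
    Empty = 0,
    Inline = 1,
    Heap = 2,
}

/// Hypothetical 24-byte layout with the tag at byte 23 (illustration only).
#[allow(dead_code)]
#[repr(C)]
struct TagLast {
    data: [u8; 23],
    last: LastByte,
}

fn main() {
    // On current rustc, `None` is encoded as an invalid `LastByte` value,
    // so the Option is no larger than the struct itself.
    assert_eq!(size_of::<TagLast>(), 24);
    assert_eq!(size_of::<Option<TagLast>>(), 24);
}
```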
I tried again (I'll open a draft PR after some more experimentation) and, yeah, the compiler hates it
Note, though, that I'm testing in a noisy environment; the std_string bench is also seeing noise. |
Thanks! I was thinking using |
No |
How about the idea that the first 64-bit word is shared (non-union) between the heap case and the short-string case? Basically, one single 64-bit memory load fetches the length byte and the first 7 byte values. Shouldn't a 64-bit read and write beat memcpy? You can shift in multiples of 8 bits to access the byte values: (word64 >> (indx << 3)) as u8. You just have to map length 0 to a non-zero 64-bit word. Either the length mapping itself can be played around with, or it can be done through a non-zero padding of the out-of-bound bytes. The heap string case would have a length of 25 chars or longer, so 0 is not possible. |
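The shift expression in the comment above can be sketched as a small helper. The name byte_at is hypothetical, and the example assumes the word was loaded in little-endian order, so that byte 0 of the buffer is the least significant byte of the word.

```rust
/// Returns byte `idx` (0..=7) of `word`, counting from the least
/// significant byte. `idx << 3` is `idx * 8`, the shift in bits.
fn byte_at(word: u64, idx: u32) -> u8 {
    debug_assert!(idx < 8);
    (word >> (idx << 3)) as u8
}

fn main() {
    // One 64-bit word holding a length byte followed by 7 data bytes.
    let word = u64::from_le_bytes(*b"\x05hello\0\0");
    assert_eq!(byte_at(word, 0), 5);    // the length byte
    assert_eq!(byte_at(word, 1), b'h'); // first character
    assert_eq!(byte_at(word, 5), b'o'); // last character
}
```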
See #75. The problem isn't extra copies: the compiler/optimizer is really good at turning small copies into the efficient thing, and you aren't going to beat it by switching between aligned writes and larger typed copies. The difference comes from opaque optimizer shenanigans, and seemingly unrelated changes are going to be what fix or diagnose it. My current bet is that these microbenchmarks are so sensitive that the noise we're seeing may even be down to machine code alignment. (I know I've read other, better articles about this, at least one of which was attached to a tool to shuffle machine code around to diagnose whether that is the compounding factor, but I can't find it right now.) |
Update fuzz.yml
Completed as part of #105 |
It would be very beneficial for me to reduce the size of Option<CompactStr>. I use lots of it in my projects.
I made some tests, and it looks like Option<CompactStr> takes an additional 8 bytes in comparison to Option<String>.
Rust has null pointer optimization (see https://doc.rust-lang.org/std/option/index.html#representation), but I am not sure if it can be used in a userland crate.
sizeofs of popular Strings.
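A small sketch of the kind of measurement described above, using only the standard library. NoNiche is a hypothetical stand-in for an inline 24-byte string type in which every bit pattern is valid. It is not CompactStr itself, and the printed numbers assume a 64-bit target.

```rust
use std::mem::size_of;

/// Hypothetical 24-byte, 8-byte-aligned type with no invalid bit patterns,
/// so the compiler has nowhere to hide the `None` case.
#[allow(dead_code)]
#[repr(C, align(8))]
struct NoNiche([u8; 24]);

fn main() {
    // String holds a non-null pointer, which gives Option a free niche.
    println!("String:          {}", size_of::<String>());
    println!("Option<String>:  {}", size_of::<Option<String>>());

    // Without a niche, Option needs a separate discriminant, and alignment
    // rounds the extra byte up to 8.
    println!("NoNiche:         {}", size_of::<NoNiche>());
    println!("Option<NoNiche>: {}", size_of::<Option<NoNiche>>());
}
```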