-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Micro-optimize vec.with_c_str for short vectors #9352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -217,6 +224,45 @@ impl<'self> ToCStr for &'self [u8] { | |||
CString::new(buf as *libc::c_char, true) | |||
} | |||
} | |||
|
|||
/// WARNING: This function uses an optimization to only malloc a temporary | |||
/// CString when the source string is small. Do not save a reference to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You write small
but I think you mean large
, but does the doc have to spill the guts here? Either the small or the large string case, the CString or buffer is just a temporary and the pointer will be invalid after the closure exits, that's all the user should need to know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed it.
This is ready for review again. It also adds a couple other things. First, it bumps the size of the buffer to 128 bytes. Next, it adds the micro-optimization to |
@@ -145,7 +154,7 @@ pub trait ToCStr { | |||
fn to_c_str(&self) -> CString; | |||
|
|||
/// Unsafe variant of `to_c_str()` that doesn't check for nulls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still mentions Unsafe
. Also, could you expand the documentation to explain what happens when there is actually an interior null?
@cmr: I'm not yet ready to agree, but I'll factor out the |
|
||
fn with_c_str<T>(&self, f: &fn(*libc::c_char) -> T) -> T { | ||
if self.len() < BUF_LEN { | ||
do self.as_imm_buf |self_buf, self_len| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this avoid unsafe code and use:
let mut buf = ...;
vec::bytes::copy_memory(buf, self.as_bytes());
buf[self.len()] = 0;
do buf.as_imm_buf |buf, _| {
// etc
}
?
This now makes it unsafe to save the pointer returned by .with_c_str as that pointer now may be pointing at a stack allocated array. I arbitrarily chose 32 bytes as the length of the stack vector, and so it might not be the most optimal size. before: test c_str::bench::bench_with_c_str_long ... bench: 539 ns/iter (+/- 91) test c_str::bench::bench_with_c_str_medium ... bench: 97 ns/iter (+/- 2) test c_str::bench::bench_with_c_str_short ... bench: 70 ns/iter (+/- 5) after: test c_str::bench::bench_with_c_str_long ... bench: 542 ns/iter (+/- 13) test c_str::bench::bench_with_c_str_medium ... bench: 53 ns/iter (+/- 6) test c_str::bench::bench_with_c_str_short ... bench: 19 ns/iter (+/- 0)
The documentation for `.with_c_str()` already says that the pointer will be deallocated before returning from the function.
This matches @graydon's recommendation.
before: test c_str::bench::bench_with_c_str_unchecked_long ... bench: 361 ns/iter (+/- 9) test c_str::bench::bench_with_c_str_unchecked_medium ... bench: 75 ns/iter (+/- 2) test c_str::bench::bench_with_c_str_unchecked_short ... bench: 60 ns/iter (+/- 9) after: test c_str::bench::bench_with_c_str_unchecked_long ... bench: 362 ns/iter (+/- test c_str::bench::bench_with_c_str_unchecked_medium ... bench: 30 ns/iter (+/- 7) test c_str::bench::bench_with_c_str_unchecked_short ... bench: 12 ns/iter (+/- 4)
This also fixes a bug in `vec.with_c_str_unchecked` where we were not calling `.to_c_str_unchecked()` for long strings.
One of the downsides with `c_str` is that it always forces an allocation, and so this could add unnecessary overhead to various calls. This PR implements one of the suggestions @graydon made in #8296 for `vec.with_c_str` in that for a short string can use a small stack array instead of a malloced array for our temporary c string. This ends up being twice as fast for small strings. There are two things to consider before landing this commit though. First off, I arbitrarily picked the stack array as 32 bytes, and I'm not sure if this a reasonable amount or not. Second, there is a risk that someone can keep a reference to the interior stack pointer, which could cause mayhem if someone were to dereference the pointer. Since we also can easily grab and return interior pointers to vecs though, I don't think this is that much of an issue.
One of the downsides with
c_str
is that it always forces an allocation, and so this could add unnecessary overhead to various calls. This PR implements one of the suggestions @graydon made in #8296 forvec.with_c_str
in that for a short string can use a small stack array instead of a malloced array for our temporary c string. This ends up being twice as fast for small strings.There are two things to consider before landing this commit though. First off, I arbitrarily picked the stack array as 32 bytes, and I'm not sure if this a reasonable amount or not. Second, there is a risk that someone can keep a reference to the interior stack pointer, which could cause mayhem if someone were to dereference the pointer. Since we also can easily grab and return interior pointers to vecs though, I don't think this is that much of an issue.