Added hexadecimal encoding module #8287
/// A trait for converting hexadecimal encoded values
pub trait FromHex {
    /// Converts the value of `self`, interpreted as base64 encoded data, into
Typo: `base64` should be hexadecimal.
Overall I think this is fine to include, although we're starting to get a fair number of encodings into libextra. I think that go's organization is nice with an
And then all future encodings can be thrown into there as well.

Another thing which I think would be nice is to have a unifying trait for encoding/decoding things to/from types. I'm not quite sure how that would look, but it'd be a shame if we had different traits for all the encodings. One possible route would be for things like json/xml to implement the

Regardless, I think that we should get another opinion on whether this should be included in libextra (I'm in favor), and if there's any sort of reorganization of encodings it should happen on the mailing list/issue list and not here.
+1 on an I don't think there's a reasonable trait for |
I did notice some very weird performance numbers from
|
Regarding the speed of to_hex, you can do something like the following
I tested some different options and this was 3x faster than the original using |
I don't have a compiler to check, but why can't one use |
|
I was thinking about this earlier today, and currently you provide a |
Sounds reasonable to me. I only had a |
@sfackler: yes |
@alexcrichton I'd prefer to distinguish serializers (JSON, ASN.1, etc.) from codecs (Base64, quoted-printable, etc.)
@omasanori, that's a good point! Let's move discussion to #8310

@sfackler, this looks good to me, I would be willing to r+ it, but again someone else should probably chime in about whether we should start shoving lots of encoding formats into libextra
FromHex ignores whitespace and parses either upper or lower case hex digits. ToHex outputs lower case hex digits with no whitespace. Unlike ToBase64, ToHex doesn't allow you to configure the output format. I don't feel that it's super useful in this case.
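The behaviour described above can be sketched in present-day Rust. This is only an illustration, not the code from this PR: the free functions `to_hex` and `from_hex` and the `HexError` type are hypothetical names chosen for the example, and treating every Unicode whitespace character as ignorable is an assumption.

```rust
/// Hypothetical error type for this sketch; not the PR's actual API.
#[derive(Debug, PartialEq)]
pub enum HexError {
    /// A character that is neither whitespace nor a hex digit, with its byte offset.
    InvalidCharacter(char, usize),
    /// The input held an odd number of hex digits.
    OddLength,
}

/// Encode bytes as lowercase hex digits with no whitespace.
pub fn to_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        out.push_str(&format!("{:02x}", b));
    }
    out
}

/// Decode hex text, skipping whitespace and accepting upper or lower case digits.
pub fn from_hex(s: &str) -> Result<Vec<u8>, HexError> {
    let mut out = Vec::with_capacity(s.len() / 2);
    // Holds the high nibble while waiting for the low nibble of the same byte.
    let mut pending: Option<u8> = None;
    for (idx, c) in s.char_indices() {
        if c.is_whitespace() {
            continue; // assumption: all Unicode whitespace is ignored
        }
        let nibble = match c.to_digit(16) {
            Some(d) => d as u8,
            None => return Err(HexError::InvalidCharacter(c, idx)),
        };
        match pending.take() {
            Some(high) => out.push((high << 4) | nibble),
            None => pending = Some(nibble),
        }
    }
    if pending.is_some() {
        return Err(HexError::OddLength);
    }
    Ok(out)
}
```

With these definitions, `to_hex(b"foobar")` yields `"666f6f626172"`, and `from_hex("666F 6f62\n6172")` decodes back to the bytes of `foobar` despite the mixed case and embedded whitespace.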
The overhead of str::push_char is high enough to cripple the performance of these two functions. I've switched them to build the output in a ~[u8] and then convert to a string later. Since we know exactly the bytes going into the vector, we can use the unsafe version to avoid the is_utf8 check.

I could have riced it further with vec::raw::get, but it only added ~10MB/s so I didn't think it was worth it. ToHex is still ~30% slower than FromHex, which is puzzling.

Before:

```
test base64::test::from_base64 ... bench: 1000 ns/iter (+/- 349) = 204 MB/s
test base64::test::to_base64 ... bench: 2390 ns/iter (+/- 1130) = 63 MB/s
...
test hex::tests::bench_from_hex ... bench: 884 ns/iter (+/- 220) = 341 MB/s
test hex::tests::bench_to_hex ... bench: 2453 ns/iter (+/- 919) = 61 MB/s
```

After:

```
test base64::test::from_base64 ... bench: 1271 ns/iter (+/- 600) = 160 MB/s
test base64::test::to_base64 ... bench: 759 ns/iter (+/- 286) = 198 MB/s
...
test hex::tests::bench_from_hex ... bench: 875 ns/iter (+/- 377) = 345 MB/s
test hex::tests::bench_to_hex ... bench: 593 ns/iter (+/- 240) = 254 MB/s
```
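For readers on current Rust, the technique in this commit (fill a byte vector, then convert it to a string without re-validating UTF-8) might look roughly like the sketch below. It is not the commit's code: `to_hex_fast` and `HEX_DIGITS` are hypothetical names, and `Vec<u8>` with `String::from_utf8_unchecked` stands in for the `~[u8]` and the unsafe conversion of the time.

```rust
/// Lookup table of lowercase hex digits; every entry is an ASCII byte.
const HEX_DIGITS: &[u8; 16] = b"0123456789abcdef";

/// Encode bytes as lowercase hex by filling a byte vector first and
/// converting it to a `String` once at the end.
pub fn to_hex_fast(bytes: &[u8]) -> String {
    let mut out: Vec<u8> = Vec::with_capacity(bytes.len() * 2);
    for &b in bytes {
        out.push(HEX_DIGITS[(b >> 4) as usize]); // high nibble
        out.push(HEX_DIGITS[(b & 0x0f) as usize]); // low nibble
    }
    // SAFETY: every byte pushed above comes from HEX_DIGITS, which contains
    // only ASCII characters, so `out` is guaranteed to be valid UTF-8.
    unsafe { String::from_utf8_unchecked(out) }
}
```

The unchecked conversion is sound only because the function controls every byte it writes; a version that copied caller-supplied bytes into `out` would need the checked `String::from_utf8` instead.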
String NULL terminators are going away soon, so we may as well get rid of this now so it doesn't rot.
Encoding should really only be done from [u8]<->str. The extra convenience implementations don't really have a place, especially since they're so trivial. Also improved error messages in FromBase64.
@brson retry? |