V2 + Deflate serialization support with flate2's stream wrappers #43
Conversation
// least to the user, but for now we'll only take an easily reversible step towards that. There are
// still several ways the serializer interfaces could change to achieve better performance, so
// committing to anything right now would be premature.
trait TestOnlyHypotheticalSerializerInterface {
This, and its cousin in the private test module, are regrettable, but I still have a handful of TODOs to explore in serialization so I don't want to commit to anything just yet. Plus, I suspect it will not be common that users want to polymorphically choose which serializer they use (if for no other reason than that this feature is so new...).
It'd make me happy if we could at least somehow share this trait and its impls between the benchmarks and the tests, though.
Yeah, I would like that as well, but I couldn't think of a way to do it without making it pub. Thoughts?
I think you may be able to (ab)use the #[path] syntax to include a file with this trait + impls as a module in both...
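A rough sketch of that idea (the shared file name, serializer_harness.rs, and module name are made up for illustration): both the benchmarks module and the private test module include the same file as a non-pub module, so the trait never has to become pub.

// At the top of the benchmarks module and again in the #[cfg(test)] module,
// pull in the shared file; the path is hypothetical and relative to this file.
#[path = "serializer_harness.rs"]
mod serializer_harness;

use self::serializer_harness::TestOnlyHypotheticalSerializerInterface;

Including the same file from two modules compiles it twice into distinct items, which is harmless for a test-only trait but worth keeping in mind.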
.ok_or(DeserializeError::UsizeTypeTooSmall)?;

// TODO reuse deflate buf, or switch to lower-level flate2::Decompress
let mut deflate_reader = DeflateDecoder::new(reader.take(payload_len as u64));
allocation :'(
Is it the allocation of the DeflateDecoder you're saddened about? It's a relatively minor allocation, and it's not data-dependent, so I don't know that it'll add significant overhead.
Yeah, but mostly for intellectual vanity reasons (and a little bit because I want to keep the way to no-heap open) I want serialization and deserialization to be allocation-free. :)
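A minimal sketch of the allocation-free direction hinted at by the "switch to lower-level flate2::Decompress" TODO, assuming a caller-owned scratch buffer. The function name, the buffer handling, and the use of the current flate2 Decompress/FlushDecompress API (the PR-era API spelled the flush enum differently) are all assumptions, not code from this PR.

use flate2::{Decompress, FlushDecompress, Status};

// Hypothetical helper: inflate a raw-deflate `input` into a caller-owned
// `scratch` buffer so deserialization itself performs no heap allocation.
// Error handling and the too-small-buffer case are simplified for the sketch;
// a real version would loop, draining output as it goes.
fn inflate_into(input: &[u8], scratch: &mut [u8]) -> Result<usize, String> {
    // `false`: raw deflate stream, matching the DeflateDecoder used above.
    let mut inflater = Decompress::new(false);
    let status = inflater
        .decompress(input, scratch, FlushDecompress::Finish)
        .map_err(|e| e.to_string())?;
    match status {
        Status::StreamEnd => Ok(inflater.total_out() as usize),
        // Ok/BufError here means `scratch` was not big enough for this sketch.
        _ => Err("scratch buffer too small for the decompressed payload".into()),
    }
}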
self.compressed_buf.write_u32::<BigEndian>(0)?;

// TODO pluggable compressors? configurable compression levels?
// TODO benchmark https://github.com/sile/libflate
So many TODOs! But none of them are errors, just things I want to explore.
src/serialization/serialization.rs
@@ -86,6 +90,8 @@
//! fn serialize<S: Serializer>(&self, serializer: S) -> Result<(), ()> {
//! // not optimal to not re-use the vec and serializer, but it'll work
//! let mut vec = Vec::new();
//! // pick the format you want to use
I'd like to see this expanded on a little
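For instance, the comment could be expanded to spell out the choice between the two serializers this PR provides. A hedged sketch (vec comes from the surrounding doc example; histogram and the exact serialize signature are assumptions based on the rest of the PR):

//! // Pick the format you want to use: V2Serializer writes the plain V2
//! // encoding, while V2DeflateSerializer wraps the same encoding in deflate
//! // compression (smaller output at the cost of CPU time).
//! let mut serializer = V2Serializer::new();
//! // or: let mut serializer = V2DeflateSerializer::new();
//! serializer.serialize(&histogram, &mut vec).map_err(|_| ())?;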
@@ -176,13 +183,19 @@ mod benchmarks;
mod v2_serializer;
pub use self::v2_serializer::{V2Serializer, V2SerializeError};

#[path = "v2_deflate_serializer.rs"]
You know that Rust, when confronted with mod foo, will look for both foo/mod.rs and ./foo.rs by default, right? I don't think this #[path] should be necessary.
You'd think that, and yet without this...
error: cannot declare a new module at this location
--> src/serialization/serialization.rs:186:1
|
186 | mod v2_deflate_serializer;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see issue #37872 <https://github.com/rust-lang/rust/issues/37872>
Hmm... Reading issue rust-lang/rust#37872, it seems like this is caused by us using #[path] elsewhere, which sort of "taints" the path such that everything needs #[path] annotations. Weird... And unfortunate.
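For reference, a sketch of why the attribute ends up being required, with the layout assumed from this PR (file names may not match exactly): serialization.rs is itself brought in via #[path] rather than being a mod.rs, so child modules declared inside it need explicit #[path] attributes too.

// src/lib.rs (assumed layout):
#[path = "serialization/serialization.rs"]
pub mod serialization;

// src/serialization/serialization.rs: without the attribute, a bare
// `mod v2_deflate_serializer;` here hits rust-lang/rust#37872, because this
// file is not serialization/mod.rs and so has no implicit child directory.
#[path = "v2_deflate_serializer.rs"]
mod v2_deflate_serializer;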
self.uncompressed_buf.clear();
self.compressed_buf.clear();
// TODO serialize directly into uncompressed_buf without the buffering inside v2_serializer
Ahh, yeah, that's unfortunate :(
{
// TODO reuse deflate buf, or switch to lower-level flate2::Compress
let mut compressor = DeflateEncoder::new(&mut self.compressed_buf, Compression::Default);
I wonder if we could speed up first-time serialization pretty substantially by resizing self.compressed_buf here to some fraction of self.uncompressed_buf.len(). This would avoid doing multiple Vec resizes during the first compression.
Sure, why not. That would make the pessimal case of single-use serializers less bad!
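A minimal sketch of that suggestion; the helper name and the 1/2 ratio are invented, and a measured typical compression ratio would be a better starting point.

// Hypothetical helper, called before the first compression so the
// DeflateEncoder's writes into the compressed buffer don't trigger several
// incremental Vec growths. The 1/2 estimate is a guess, not a measurement.
fn presize_compressed_buf(compressed_buf: &mut Vec<u8>, uncompressed_len: usize) {
    let estimate = uncompressed_len / 2;
    // reserve() is a no-op once the buffer already has enough capacity, so
    // this only does real work on a fresh (or unusually small) serializer.
    compressed_buf.reserve(estimate.saturating_sub(compressed_buf.len()));
}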
LGTM!
Performance isn't awful but it isn't great either. Compression is slow and there's not a lot we can do about that.