-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement automatic decompression of responses #125
Conversation
These tests are currently broken, but should succeed once decompression works. They should also catch a couple error cases, although it's unclear if the behavior I'm testing for in the case that someone sends a bad response (one marked as compressed that actually isn't) is ideal. These will probably need to be extended to support streaming / async cases in the future.
Doing this as an extension on the request avoids the need to either (a) manufacture a fake header, or (b) add a new field to the Session type. Thanks to @aturon for the pointer!
…he fly. I don't love the buffering behavior of this, and it almost certainly does a completely unnecessary copy that should be easy to remove; that being said, this is where I got to before I closed up for the night.
lib/src/session.rs
Outdated
@@ -658,6 +658,9 @@ impl Session { | |||
} | |||
} | |||
|
|||
#[derive(Clone, Copy, Eq, Hash, PartialEq)] | |||
pub struct AutoDecompressResponse {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a cool way to represent the flag, but there is a drawback for the future: we'll need to add additional metadata to requests and responses to keep up with C@E proper. (I'm doing some work that expands those out right now).
I think we should consider more closely following C@E's approach of introducing FastlyRequestMetadata
/FastlyResponseMetadata
structs with fields representing the various knobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 7aa1e6d ! Except that I called it ViceroyRequestMetadata
.
lib/src/body.rs
Outdated
} | ||
Poll::Ready(Some(item)) => { | ||
// put the body back, so we can poll it again next time | ||
self.chunks.push_front(body.into()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to push the entire CompressedHttpBody
chunk to the front of the queue, to continue reading into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. Good catch. Should be addressed in 88146fb.
use fastly::{Backend, Request}; | ||
use fastly::http::request::SendError; | ||
|
||
static HELLO_WORLD: &'static str = include_str!("../../data/hello_world"); | ||
static HELLO_WORLD_GZ: &'static [u8] = include_bytes!("../../data/hello_world.gz"); | ||
|
||
fn main() -> Result<(), SendError> { | ||
let echo_server = Backend::from_name("echo").expect("Could not find echo backend?"); | ||
|
||
// Test framework sanity check: a request without the auto_decompress flag | ||
// should bounce back to us unchanged. | ||
let standard_echo = Request::put("http://127.0.0.1:9000") | ||
.with_header("Content-Encoding", "gzip") | ||
.with_body_octet_stream(HELLO_WORLD_GZ) | ||
.send(echo_server.clone())?; | ||
|
||
assert_eq!(standard_echo.get_header_str("Content-Encoding"), Some("gzip")); | ||
assert_eq!(standard_echo.get_content_length(), Some(HELLO_WORLD_GZ.len())); | ||
|
||
// Similarly, if we set the auto_decompress flag to false, it should also | ||
// bounce back to us unchanged. | ||
let explicit_no = Request::put("http://127.0.0.1:9000") | ||
.with_header("Content-Encoding", "gzip") | ||
.with_body_octet_stream(HELLO_WORLD_GZ) | ||
.with_auto_decompress_gzip(false) | ||
.send(echo_server.clone())?; | ||
|
||
assert_eq!(explicit_no.get_header_str("Content-Encoding"), Some("gzip")); | ||
assert_eq!(explicit_no.get_content_length(), Some(HELLO_WORLD_GZ.len())); | ||
|
||
// But if we set the auto_decompress flag to true, and send a compressed | ||
// file, we should get the uncompressed version back | ||
let mut unpacked_echo = Request::put("http://127.0.0.1:9000") | ||
.with_header("Content-Encoding", "gzip") | ||
.with_body_octet_stream(HELLO_WORLD_GZ) | ||
.with_auto_decompress_gzip(true) | ||
.send(echo_server.clone())?; | ||
|
||
assert!(unpacked_echo.get_header("Content-Encoding").is_none()); | ||
assert!(unpacked_echo.get_content_length().is_none()); | ||
let hopefully_unpacked = unpacked_echo.take_body_str(); | ||
assert_eq!(HELLO_WORLD, &hopefully_unpacked); | ||
|
||
// That being said, if we set the flag to true and send it a text file, | ||
// we should just get it back unchanged. | ||
let mut yes_but_uncompressed = Request::put("http://127.0.0.1:9000") | ||
.with_body_octet_stream(HELLO_WORLD.as_bytes()) | ||
.with_auto_decompress_gzip(true) | ||
.send(echo_server.clone())?; | ||
|
||
let still_unpacked = yes_but_uncompressed.take_body_str(); | ||
assert_eq!(HELLO_WORLD, &still_unpacked); | ||
|
||
// A slightly odder case: We set everything up for unpacking, but we | ||
// don't actually send a gzip'd file. In this case, we're going to say | ||
// that we should get back exactly the input, even with the bogus | ||
// content encoding. | ||
let bad_gzip = Request::put("http://127.0.0.1:9000") | ||
.with_header("Content-Encoding", "gzip") | ||
.with_body_octet_stream(HELLO_WORLD.as_bytes()) | ||
.with_auto_decompress_gzip(true) | ||
.send(echo_server.clone())?; | ||
|
||
assert_eq!(bad_gzip.get_header_str("Content-Encoding"), Some("gzip")); | ||
assert_eq!(bad_gzip.get_content_length(), Some(HELLO_WORLD.len())); | ||
|
||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is lovely! As we mentioned yesterday, we'll want to expand out with some uses of append
operations and streaming as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This appears to be a bit more standardized, and provides a nicer way to carve off early chunks, as well. Based on testing, it's possible that this will consistently generate an empty chunk at the end of the decompression run, but that should be managed correctly by other parts of the system.
This is relatively straightforward, but more thought should go into the error case in which someone marks data as gzip-compressed but it isn't. The system remains whole after this, but the observable behavior is odd in that it gets marked as StatusCode::OK but taking the body leads to uncatchable failures (I think).
…`hyper::Body`. A bit of an oops on the previous implementations. This requires slightly more careful handling of lifetimes to avoid copying. It also only puts the current subchunk back into the queue in the non-error cases, as opposed to previous versions which would put error cases back up to get more errors.
…nsion data. Other folks should be able to just pile on to this structure to add additional information in the future, without much pain, and without having to add a new type for every item. There's a benign clippy warning in `req_impl.rs` that'll be nicely resolved when someone does so, too.
This test shouldn't actually be very interesting, and wasn't. Yay! But, in the future, if `send`, `send_async`, and `send_async_streaming` stop going through the same output code path and someone forgets to move the decompression code with it, we should be able to detect that.
Just in case there was a funny code path in there, or one gets introduced, this makes sure that you can properly read your compressed data from the client side, as well.
This reverts commit 6ec48e7.
I don't actually love this, but it's probably the least-bad alternative for dealing with differences of opinions about the structure of end-of-line markers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good! Just a couple minor nits!
lib/src/upstream.rs
Outdated
&& basic_response.headers().get(header::CONTENT_ENCODING) | ||
== Some(&HeaderValue::from_static("gzip")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also allow for x-gzip
per https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#gzip.coding (and the C@E implementation).
lib/src/wiggle_abi/req_impl.rs
Outdated
// NOTE: We're going to hide this flag in the headers of the request in order to decrease | ||
// the book-keeping burden inside Session; there already existed a cleaning pass in the | ||
// output chain, which we extend for this use case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is stale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic work!
Fixes #124.
The flag marking this use case is held inside the extensions for the request, in the
ViceroyRequestMetadata
structure -- said structure can be extended in the future for other use cases -- and then queried in the final parts ofsend_request
insrc/upstream.rs
. The test cases work through an echo server, which just bounces the request right back, which seemed the easiest way to get the various test cases back.The error test case -- in which the response comes backed marked as gzip-encoded but isn't -- matches the implementation in C@E.