-
Notifications
You must be signed in to change notification settings - Fork 359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(lambda-http): accept http_body::Body in responses #491
Conversation
`cross` didn't work - zigbuild works as a replacement additonal change: match the layer name with the executable
Thanks a lot for looking into this @hugobast ! I'm going to take a look at that extra commit to see if I find a way to make the verification work automatically, but the changes look good at first glance. I will run some tests this week. @david-perez, @bnusunny can you take a look at this PR too? |
Co-authored-by: David Calavera <david.calavera@gmail.com>
lambda-http/src/response.rs
Outdated
// Can safely unwrap here as we do type validation in the 'if' statement | ||
Box::pin(ready(*any_self.downcast::<Body>().unwrap())) | ||
} else { | ||
Box::pin(async move { Body::from(to_bytes(self).await.expect("unable to read bytes from body").to_vec()) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've experimented with this patch and the more I think about this issue, the more I think that it's impossible for aws-lambda-rust-runtime
to provide a response body conversion that will work for all cases.
The issue I have with this line in particular is that it's effectively saying "if the user's tower::Service
has a Response
whose Body
is not of type lambda_http::Body
, then convert the response body to_bytes, then to Vec<u8>
, and use lambda_http::Body::from
to yield a response whose body is of the lambda_http::Body::Binary
variant".
This is problematic, because lambda_http::Body::Binary
is signalling to API Gateway that the response contains binary data, and API Gateway does all kinds of fun stuff before relaying the reponse to the client depending on several factors, including whether we signalled our payload contains binary data or not. For example, it might base64 encode the body before returning the response.
I deployed some Lambdas and was able to reproduce clearly undersirable behavior when the service returns a response body with Data = Bytes
that contains a UTF-8 encoded string. Now, one might argue that the user should then use Data = String
, or have their Response
type be String
directly (which would leverage the impl IntoResponse for String
in response.rs
, that correctly uses lambda_http::Body::Text
). The problem is the user might not know or be in control of what body type they're returning because they are relying on a library. For example, Axum uses axum_core::body::BoxBody
as the unified response body type of all the routes, regardless of what data type your handler returns. BoxBody
is a type-erased boxed body type that holds Bytes
.
Here is an example repository that reproduces the problem: https://github.com/david-perez/lambda-pr-491. It has instructions to deploy a Rust Lambda function fronted by both a HTTP API Gateway and a REST API Gateway using CDK.
The Lambda function uses Axum and contains a route that returns JSON:
#[tokio::main]
async fn main() -> Result<(), Error> {
let lambda_handler = Router::new().route("/prod/hello", get(hello));
lambda_http::run(lambda_handler)
.await
.expect("something went wrong in the Lambda runtime");
Ok(())
}
async fn hello() -> Json<Value> {
Json(json!({ "message": "hello stranger!"}))
}
As explained above, this setup makes aws-lambda-rust-runtime
return a lambda_http::Body::Binary
body to the internal AWS Lambda service, signalling a binary payload.
When the response gets sent to the client, it seems like the behavior of API Gateway is dfferent depending on whether the API type is HTTP or REST.
$ curl https://<http-api-id>.execute-api.eu-west-1.amazonaws.com/prod/hello
{"message":"hello stranger!"}%
$ curl https://<rest-api-id>.execute-api.eu-west-1.amazonaws.com/prod/hello
eyJtZXNzYWdlIjoiaGVsbG8gc3RyYW5nZXIhIn0=%
The HTTP API works as we would expect, but the the REST API has "helpfully" base64 encoded the JSON text. In both cases the response has the Content-Type: application/json
header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll continue with the information you've provided, I wasn't able to get to the core of the issue and was focusing too much on the issues I had with coming up with a test setup for the response conversion table, what you provided is straightforward and really helpful. Thank you, David!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@david-perez After a few hours of playing with this, the problem as I see it is that axum::body::BoxBody
is a downcast (is that the correct term?) of http_body::Body
and I don't see how we can get anything more out of it than the bytes it's holding. In other words, is it correct to say that it is essentially "just bytes" at this point and we have no choice but the Body::Binary
variant?
That said I had an idea maybe this is what we are looking for
Box::pin(async move { Body::from(to_bytes(self).await.expect("unable to read bytes from body").to_vec()) }) | |
Box::pin(async move { | |
let bytes = to_bytes(self).await.expect("unable to read bytes from body"); | |
match from_utf8(&bytes) { | |
Ok(s) => Body::from(s), | |
Err(_) => Body::from(bytes.to_vec()), | |
} | |
}) |
To summarize the code we are commenting on
lambda_http::Body
pass through as they are- Other
bytes::Bytes
types (of whichBoxBody
is)
a. utf8 from bytes ->Body::Text
b. if not ->Body::Binary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hugobast That's a neat idea. Unfortunately, the fact that the payload is valid UTF-8 does not imply that its contents are text. A priori the user could be sending a PNG that is valid UTF-8 text, for example.
Looking at the Content-Type
header of the response is another approach. But again, it's not an infallible solution. The user might not have set the header. It's also easy to map application/json
or text/plain
to Body::Text
, but we can't possibly handle all standard content types, let alone user-defined custom ones.
There are no "one size fits all" solutions unfortunately. All in all, this is why I think it's impossible to land this feature. We can't guess the user's intention. I think any kind of solution will have to ultimately explicitly involve the user communicating his intention to the runtime. Inserting an http::Extensions
in the response that the runtime can then read, for example. That will severely limit the user though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think any kind of solution will have to ultimately explicitly involve the user communicating his intention to the runtime
I don't think this is necessarily bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnusunny I think your point needs to be documented in the runtime documentation, but we should not make users depend on specific configurations in another service. binaryMediaTypes are for example not available when you call a function with invoke
. People that use that should still be able to use http responses as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the rules, I would suggest two updates:
2.i.a If it is equal to text/*, application/json, application/javascript or application/xml, leave the body as is and set isBase64Encoded to false.
content-type
may have suffixes, such as 'application/json; charset=UTF-8'. So it is better to use:
2.i.a If the content-type
begins with text/*
, application/json
, application/javascript
or application/xml
, leave the body as is and set isBase64Encoded
to false
.
2.ii If the
content-type
header is not present, leave the body as is and setisBase64Encoded
tofalse
.
This is problematic. If a bad client sent a binary http response without the content-type
header, we would treate it as text and Lambda service would throw an error. We should default to binary here. ALB also uses binary as default.
2.ii If the content-type
header is not present, base64 encode the body and set isBase64Encoded
to true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnusunny I think your point needs to be documented in the runtime documentation, but we should not make users depend on specific configurations in another service. binaryMediaTypes are for example not available when you call a function with
invoke
. People that use that should still be able to use http responses as expected.
Yes. We definitely need to document this configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.ii If the
content-type
header is not present, leave the body as is and setisBase64Encoded
tofalse
.
This is problematic. If a bad client sent a binary http response without the
content-type
header, we would treate it as text and Lambda service would throw an error.
However, if they send a text HTTP response without the content-type
header, the client sends e.g. Accept: text/plain
and the user has not configured contentHandling
(which they shouldn't need to, since they are returning text payloads) we would treat it as binary and return a base64-encoded response. That is, the client would receive eyJtZXNzYWdlIjoiaGVsbG8gc3RyYW5nZXIhIn0=
instead of {"message":"hello stranger!"}
.
I think APIs that return text are far more common than those that return binary payloads and as such we should default to text in this case where we don't have any information. I also prefer the Lambda service throwing a loud error than the client being puzzled at why they received base64-encoded text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. Let's default to text.
By the way. lambda_http
crate only supports Lambda Proxy integration with API GW. contentHandling
doesn't apply to Lambda Proxy integration. APIGW Developer Guide has the details.
B: HttpBody + Unpin + 'static, | ||
B::Error: fmt::Debug, | ||
{ | ||
// assumes utf-8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed changes that should be close, at least in spirit, to what was talked about in the last few days in the other thread, I'm not that comfortable with Rust so please make suggestions if the code here is suboptimal.
I have concerns about encoding @david-perez @bnusunny @calavera. I'm guessing that most things will be utf-8 (JSON?) but that's far from saying that everything is utf-8.
@bnusunny commented with an example code that's using reqwest so I looked at how the text
fn is implemented on reqwest::Response
. We could use the same approach here and parse the charset from the content-type if we'd like I'll defer to your comments/opinions on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work @hugobast ! If there is a path to check the charset, it might be worth doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's optional but here's a commit where encode based on the charset of the content-type, falling back to utf-8 in error cases: 2632aad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very good job handling these changes @hugobast, thanks a lot for your help!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran some tests, including the example in #438, and everything looks to be working as expected 👍
@calavera @bnusunny @david-perez is anything more needed here to get this merged? |
I know one of @bnusunny's concerns is whether we need to turn the response into async or not. It's a big non-backwards compatible change. When Nicolas implemented this originally, he was thinking about a future where Lambda could stream responses, but that might be better to put it into a separated PR when/if that happens. Here's the original conversation: |
Ah, let me back out of this answer it feels more tied up, considering that we use |
In my hasty response I'm afraid I won't get the point accross so I'll summarise here in a new comment. What I mean by this previous reply is reading a body is an asynchronous operation and that's why it looks like Nicolas made the response into a Future. |
@hugobast You are right. It is an asynchronous operation. Thanks very much for getting this PR done. |
Just adding this here: this is correct! 😄 |
Issue #, if available:
This PR is meant to continue the work started here: #466
Description of changes:
Original description:
Allows using any http_body::Body implementation in the response for lambda_http.
For context, the outstanding issue on the original PR stems from this comment: #466 (comment)
I picked up this PR and worked on it to the point where I believe that I had a viable testing approach. Ultimately I am blocked from performing the testing that has been requested in the original PR.
Here's why
In order to configure
contentHandling
we have to put the integration in theaws
(ie. notaws-proxy
) configuration.The request is treated as a
LambdaRequest::WebSocket
by the lambda_http crate, why this is and why the rust code panics when operated in this way, I don't know.On an
aws-proxy
configuration, binary formats are dealt with by encoding the body of the response as base64 and settingisBase64Encoded: true
(documentation: Working with binary media types for REST APIs - Amazon API Gateway)I managed to configure an API Gateway + Lambdas to be tested with CONVERT_TO_BINARY and CONVERT_TO_TEXT and it doesn't appear to work. What does work is setting the integration as
aws-proxy
but then it operates completely differently.If we test through API gateway we only have two type requests, one where the function returns a
Body::Binary
and one where it returns aBody::Text
(maybe a 3rd one forBody::Empty
).I'm proposing a simpler verification where we rely on the current unit tests (I'm adding impl IntoResponse for binaries + tests) and pasting the Lambda console results (below) that I got while debugging.
Happy to dig in further if I'm wrong and misunderstanding something.
Here's the console output for both a lambda that returns text and one that returns bytes
Bytes
Text
Finally, because I was most of the way there I want to share the commit I removed from this PR where I setup an API gateway (non-proxy) and the makefile changes where I would have added the verifications. hugobast@83fa77d
By submitting this pull request