xDS: ext_proc: add GRPC body send mode #38753
@@ -164,7 +164,7 @@ message ProcessingRequest {
   // the server must send back exactly one ProcessingResponse message.
   // * If it is set to ``FULL_DUPLEX_STREAMED``, the server must follow the API defined
   // for this mode to send the ProcessingResponse messages.
-  // [#next-free-field: 11]
+  // [#next-free-field: 12]
 message ProcessingResponse {
   // The response type that is sent by the server.
   oneof response {
@@ -224,6 +224,19 @@ message ProcessingResponse {
   // is set to true.
   envoy.extensions.filters.http.ext_proc.v3.ProcessingMode mode_override = 9;

+  // [#not-implemented-hide:]
Contributor:
Envoy dataplane has this planned, and is still working on the details. The behavior might be slightly different from the gRPC client.

Author:
Good to know. I think we should handle this the same way for both FULL_DUPLEX_STREAMED and GRPC modes, so if you have a slightly different approach in mind, I'm happy to discuss changes here.
+  // Used only in ``FULL_DUPLEX_STREAMED`` and ``GRPC`` body send modes.
+  // Instructs the data plane to stop sending body data and to send a
+  // half-close on the ext_proc stream. The ext_proc server should then echo
+  // back all subsequent body contents as-is until it sees the client's
+  // half-close, at which point the ext_proc server can terminate the stream
+  // with an OK status. This provides a safe way for the ext_proc server
+  // to indicate that it does not need to see the rest of the stream;
+  // without this, the ext_proc server could not terminate the stream
+  // early, because it would wind up dropping any body contents that the
+  // client had already sent before it saw the ext_proc stream termination.
+  bool request_drain = 11;

   // When ext_proc server receives a request message, in case it needs more
   // time to process the message, it sends back a ProcessingResponse message
   // with a new timeout value. When the data plane receives this response
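To make the drain handshake above concrete, here is a minimal Go sketch of the ext_proc server side, written against the go-control-plane generated stubs. The ``request_drain`` field is new in this PR and does not yet exist in released stubs, so it appears only as a comment; treat the whole flow as a sketch of the intended behavior under those assumptions, not a definitive implementation.

```go
package extprocserver

import (
	"io"

	extprocv3 "github.com/envoyproxy/go-control-plane/envoy/service/ext_proc/v3"
)

// drainAndClose is used once the server has seen enough of the body and
// wants to stop processing early without dropping in-flight data.
func drainAndClose(stream extprocv3.ExternalProcessor_ProcessServer) error {
	// Step 1: ask the data plane to stop sending new body data and
	// half-close the ext_proc stream.
	if err := stream.Send(&extprocv3.ProcessingResponse{
		// RequestDrain: true, // hypothetical field added by this PR
	}); err != nil {
		return err
	}
	// Step 2: echo back, unmodified, any body contents the client had
	// already sent before it saw the drain request.
	for {
		req, err := stream.Recv()
		if err == io.EOF {
			return nil // client half-closed; terminate with OK status
		}
		if err != nil {
			return err
		}
		body := req.GetRequestBody() // request direction only, for brevity
		if body == nil {
			continue
		}
		echo := &extprocv3.ProcessingResponse{
			Response: &extprocv3.ProcessingResponse_RequestBody{
				RequestBody: &extprocv3.BodyResponse{
					Response: &extprocv3.CommonResponse{
						BodyMutation: &extprocv3.BodyMutation{
							Mutation: &extprocv3.BodyMutation_StreamedResponse{
								StreamedResponse: &extprocv3.StreamedBodyResponse{
									Body:        body.GetBody(),
									EndOfStream: body.GetEndOfStream(),
								},
							},
						},
					},
				},
			},
		}
		if err := stream.Send(echo); err != nil {
			return err
		}
	}
}
```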
@@ -268,11 +281,27 @@ message HttpHeaders {
 message HttpBody {
Contributor:
This message is named HttpBody, but it does not contain any bytes or metadata related to HTTP in GRPC mode. It seems as though we should instead have a separate message for this mode, which would also avoid the pitfall of a GRPC-specific boolean in the HttpBody message.

Author:
Even in the new ``GRPC`` mode, the ``body`` field still carries the contents of the HTTP request/response body; the messages are just delivered deframed. As I mentioned in another comment thread, I agree that the naming here is not ideal. More generally, I do agree that there's a lot of tech debt in this API, and I wish it had been thought out better from the get-go rather than just evolving in place over time. But I don't think that's something we can fix at this point without giving up and creating a whole new API, which would fracture the ecosystem.

Contributor:
Adding an Any means that somewhere the Envoy ext_proc filter gRPC parsing code needs to be able to build the proto in question, which can only happen if that proto definition is linked into Envoy. Rather than …

Contributor:
Ah, sorry, I responded to Matt without seeing Mark's response. Basically agree with Mark's comment.
   // The contents of the body in the HTTP request/response. Note that in
   // streaming mode multiple ``HttpBody`` messages may be sent.
+  //
+  // In ``GRPC`` body send mode, a separate ``HttpBody`` message will be
+  // sent for each message in the gRPC stream.
   bytes body = 1;

   // If ``true``, this will be the last ``HttpBody`` message that will be sent and no
   // trailers will be sent for the current request/response.
   bool end_of_stream = 2;

+  // This field is used in ``GRPC`` body send mode when ``end_of_stream`` is
+  // true and ``body`` is empty. Those values would normally indicate an
+  // empty message on the stream with the end-of-stream bit set.
+  // However, if the half-close happens after the last message on the
+  // stream was already sent, then this field will be true to indicate an
+  // end-of-stream with *no* message (as opposed to an empty message).
+  bool end_of_stream_without_message = 3;
Contributor:
This makes the protocol very complicated, with Envoy and gRPC doing different things. Can we have a separate API for gRPC?

Author:
All this is really doing is indicating the difference between unset and empty for the ``body`` field, so I don't see why this field specifically adds a lot of complexity. I agree that there is a bit of work that needs to happen to support the new ``GRPC`` mode, but that work is not specific to this field.

Note that there is no proposal here for Envoy and gRPC to do different things. This PR is about how to handle gRPC traffic, regardless of which data plane is handling that traffic. We would want Envoy to support this body send mode as well, since Envoy does often handle gRPC traffic. I think that inventing a completely new ext_proc API for gRPC traffic doesn't make sense. The goal here is to leverage the existing ext_proc API, not build a competitor to it.

Contributor:
In GRPC mode, is an empty body with end_of_stream == true distinct from an empty body with end_of_stream_without_message == true? If so, how?

Author:
In gRPC, the client may send a half-close without sending a message, or long after the last message was sent. On the wire, that's represented as an HTTP/2 DATA frame that is empty but has the EOS bit set. But note that an empty HTTP/2 DATA frame is not the same as sending an empty gRPC message -- if there was another message sent, even if that message was empty, the empty message would still be framed within the HTTP/2 DATA frame. In other words, there are 3 possible cases for a DATA frame with the EOS bit set:

1. EOS with no message: the DATA frame payload is empty, with no gRPC framing present.
2. EOS with an empty message: the DATA frame carries the gRPC framing header for a zero-length message.
3. EOS with a non-empty message: the DATA frame carries the gRPC framing header followed by the message bytes.

Note that on the wire between gRPC client and server, the difference between cases 1 and 2 is whether the gRPC framing is present. However, in ``GRPC`` send mode, we are not including the gRPC framing when we send the message to the ext_proc server; we're sending only the deframed messages. This means that we need a way to differentiate between cases 1 and 2, which is what this field provides.
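To make the three cases concrete, here is a small runnable Go sketch of the standard gRPC length-prefixed framing (one compressed-flag byte plus a 4-byte big-endian length, per the gRPC-over-HTTP/2 wire format), showing what the DATA frame payload looks like in each case:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frame wraps one serialized message in gRPC's length-prefixed framing:
// 1 byte compressed flag, 4 bytes big-endian length, then the payload.
func frame(msg []byte, compressed bool) []byte {
	hdr := make([]byte, 5)
	if compressed {
		hdr[0] = 1
	}
	binary.BigEndian.PutUint32(hdr[1:], uint32(len(msg)))
	return append(hdr, msg...)
}

func main() {
	// Case 1: half-close with no message -- the DATA frame payload is empty.
	fmt.Printf("case 1 (no message):    [% x]\n", []byte{})
	// Case 2: an empty message -- the 5-byte framing header is still present.
	fmt.Printf("case 2 (empty message): [% x]\n", frame(nil, false))
	// Case 3: a non-empty message -- framing header plus payload.
	fmt.Printf("case 3 (one message):   [% x]\n", frame([]byte("hi"), false))
}
```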
+  // This field is used in ``GRPC`` body send mode to indicate whether
+  // the message is compressed. This will never be set to true by gRPC
+  // but may be set to true by a proxy like Envoy.
+  bool grpc_message_compressed = 4;
 }

 // This message is sent to the external server when the HTTP request and
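A hedged sketch of how an ext_proc server might branch on these fields in ``GRPC`` body send mode, reusing the ``extprocv3`` import from the earlier sketch. The ``GetEndOfStreamWithoutMessage`` accessor is an assumption based on this PR's proto and does not exist in released go-control-plane stubs:

```go
// handleGRPCBody interprets one HttpBody received in ``GRPC`` body send mode.
func handleGRPCBody(body *extprocv3.HttpBody) {
	switch {
	case body.GetEndOfStreamWithoutMessage(): // hypothetical field from this PR
		// Half-close with no trailing message: no gRPC message at all.
	case body.GetEndOfStream():
		// Final deframed gRPC message (possibly empty), then half-close.
	default:
		// Exactly one deframed gRPC message; more may follow.
	}
}
```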
@@ -331,6 +360,8 @@ message CommonResponse {
   //
   // In other words, this response makes it possible to turn an HTTP GET
   // into a POST, PUT, or PATCH.
+  //
+  // Not supported if the body send mode is ``GRPC``.
   CONTINUE_AND_REPLACE = 1;
 }
@@ -415,15 +446,34 @@ message HeaderMutation {
   repeated string remove_headers = 2;
 }

-// The body response message corresponding to FULL_DUPLEX_STREAMED body mode.
+// The body response message corresponding to ``FULL_DUPLEX_STREAMED`` or ``GRPC`` body modes.
Contributor:
Each field in this message now has a comment above it to indicate that it means semantically different things based on a field contained in a different message. This is saying to me that it belongs as a different message in a oneof at a higher level.

Author:
I don't think they're really all that semantically different. The fields mean essentially the same things in both modes; the comments just spell out the mode-specific details. If I created another message here, it would look almost identical to this one, and I think that would cause more confusion than it would provide benefit at this point.
 message StreamedBodyResponse {
-  // The body response chunk that will be passed to the upstream/downstream by the data plane.
+  // In ``FULL_DUPLEX_STREAMED`` body send mode, contains the body response chunk that will be
+  // passed to the upstream/downstream by the data plane. In ``GRPC`` body send mode, contains
+  // a serialized gRPC message to be passed to the upstream/downstream by the data plane.
   bytes body = 1;

   // The server sets this flag to true if it has received a body request with
   // :ref:`end_of_stream <envoy_v3_api_field_service.ext_proc.v3.HttpBody.end_of_stream>` set to true,
   // and this is the last chunk of body responses.
+  // Note that in ``GRPC`` body send mode, this allows the ext_proc
+  // server to tell the data plane to send a half close after a client
+  // message, which will result in discarding any other messages sent by
+  // the client application.
   bool end_of_stream = 2;

+  // This field is used in ``GRPC`` body send mode when ``end_of_stream`` is
+  // true and ``body`` is empty. Those values would normally indicate an
+  // empty message on the stream with the end-of-stream bit set.
+  // However, if the half-close happens after the last message on the
+  // stream was already sent, then this field will be true to indicate an
+  // end-of-stream with *no* message (as opposed to an empty message).
+  bool end_of_stream_without_message = 3;

+  // This field is used in ``GRPC`` body send mode to indicate whether
+  // the message is compressed. This will never be set to true by gRPC
+  // but may be set to true by a proxy like Envoy.
+  bool grpc_message_compressed = 4;
 }

 // This message specifies the body mutation the server sends to the data plane.
@@ -433,19 +483,19 @@ message BodyMutation {
   // The entire body to replace.
   // Should only be used when the corresponding ``BodySendMode`` in the
   // :ref:`processing_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.processing_mode>`
-  // is not set to ``FULL_DUPLEX_STREAMED``.
+  // is not set to ``FULL_DUPLEX_STREAMED`` or ``GRPC``.
   bytes body = 1;

   // Clear the corresponding body chunk.
   // Should only be used when the corresponding ``BodySendMode`` in the
   // :ref:`processing_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.processing_mode>`
-  // is not set to ``FULL_DUPLEX_STREAMED``.
+  // is not set to ``FULL_DUPLEX_STREAMED`` or ``GRPC``.
   bool clear_body = 2;

   // Must be used when the corresponding ``BodySendMode`` in the
   // :ref:`processing_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.processing_mode>`
-  // is set to ``FULL_DUPLEX_STREAMED``.
+  // is set to ``FULL_DUPLEX_STREAMED`` or ``GRPC``.
   StreamedBodyResponse streamed_response = 3
       [(xds.annotations.v3.field_status).work_in_progress = true];
 }
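A short sketch of the rule these comments describe, choosing the ``BodyMutation`` variant by mode. The ``streamedMode`` parameter is a stand-in for "the configured ``BodySendMode`` is ``FULL_DUPLEX_STREAMED`` or ``GRPC``"; the types are from go-control-plane, and this is illustrative only:

```go
// buildMutation picks the BodyMutation variant that is legal for the
// configured body send mode, per the comments in the diff above.
func buildMutation(payload []byte, streamedMode, last bool) *extprocv3.BodyMutation {
	if streamedMode {
		// FULL_DUPLEX_STREAMED / GRPC: streamed_response must be used.
		return &extprocv3.BodyMutation{
			Mutation: &extprocv3.BodyMutation_StreamedResponse{
				StreamedResponse: &extprocv3.StreamedBodyResponse{
					Body:        payload,
					EndOfStream: last,
				},
			},
		}
	}
	// Other modes: replace the entire body chunk directly.
	return &extprocv3.BodyMutation{
		Mutation: &extprocv3.BodyMutation_Body{Body: payload},
	}
}
```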
Contributor:
This protocol mode does not seem to define a new behavior, but rather a semantic interpretation of the content of the body bytes. An ext_proc client that sends a ``body`` containing a complete gRPC message using the GRPC mode is indistinguishable to the ext_proc server from one using any other mode -- i.e., they will all be processed the same way.

In the end, I think this change complicates the already complicated and confusing protocol even more without offering a tangible benefit. Moreover, server implementations still have to support gRPC messages split over multiple ext_proc messages, since this is an existing behavior.
Author:
That's not actually true -- the contents sent to the ext_proc server are actually different under this mode, so the ext_proc server can absolutely tell the difference.

For gRPC traffic, the content of the HTTP/2 DATA frames is a sequence of framed gRPC messages. However, there is no guarantee that each DATA frame contains exactly one framed gRPC message; it's entirely possible for a single framed gRPC message to span multiple DATA frames, and it's also entirely possible to have multiple framed gRPC messages within a single DATA frame. The goal of the new ``GRPC`` body send mode is to handle the buffering and deframing in the data plane instead of making every ext_proc server handle it itself.

In an existing body send mode like ``FULL_DUPLEX_STREAMED``, every ext_proc server will need to handle buffering and deframing of gRPC messages as it receives DATA frame chunks. In the new ``GRPC`` mode, the ext_proc server sees only the deframed gRPC messages, and it always gets exactly one deframed gRPC message in each body message on the ext_proc stream.

In addition, note that gRPC cannot implement any other mode here, because our filters do not have access to the raw HTTP/2 DATA frames. In gRPC, we handle deframing the gRPC messages in our transport layer (i.e., the same place where we handle the HTTP/2 protocol), and our filters see only the deframed gRPC messages. The only way we could implement the current ``FULL_DUPLEX_STREAMED`` protocol would be some horrible hack where we'd need to re-add the HTTP/2 framing to each message before we send it to the ext_proc server, which really doesn't make sense -- and it would probably hurt performance by requiring additional memory allocations. And since we want to support the ability for users to switch between proxy-based setups and proxyless gRPC, we need some common protocol that both data planes can support for gRPC traffic.

So basically, there are two arguments for this new mode: it moves the gRPC message buffering and deframing into the data plane so that every ext_proc server does not have to reimplement it, and it is the only mode that a proxyless gRPC data plane can actually provide.

As for servers having to support gRPC messages split over multiple ext_proc messages, I think there's a clear path to removing the need for that in existing servers:

1. Add support for the ``GRPC`` mode in the ext_proc server. (Note that the ext_proc request already tells the server what body send mode the client is using, so during this migration, the server can tell how it's supposed to handle each ext_proc stream.)
2. Move the data planes over to the ``GRPC`` mode, at which point the server's own buffering and deframing code is no longer exercised.
Contributor:
The ``BodySendMode`` describes the behavior of the state machine for exchanging body content with the server. You are describing two things simultaneously with this new value: what the body payload contains and (possibly) new state transitions. The latter part is not described here. Does this value imply that the state machine is the same as STREAMED but with a payload of gRPC messages? Or is it the same as FULL_DUPLEX_STREAMED? Or something new entirely? And why can't I use the BUFFERED state machine if I know I will only have unary gRPC messages?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The state machine is essentially the same as FULL_DUPLEX_STREAMED, just with the addition of the buffering and deframing of gRPC messages. In other words, the data plane will buffer only until it sees a complete gRPC message, at which point it will deframe the message and stream it to the ext_proc server. And the body responses sent back from the ext_proc server will each be an individual gRPC message as well.
I don't think it makes sense to use BUFFERED mode here. Note that the gRPC wire protocol is designed to handle streaming as a first-class citizen, and unary RPCs are just a special case: a unary RPC is a bidi stream that just happens to have a single request followed by a single result. Inside the HTTP/2 DATA frames, the request and response messages are still encoded with the same gRPC framing as in streaming cases. So using BUFFERED mode would still require the data plane to send the raw contents of the HTTP/2 DATA frames, and it would require the ext_proc server to handle the deframing, which is what we're trying to avoid here. (I don't think we want to add a BUFFERED_GRPC mode to make it handle the framing, and even if we did, the behavior would wind up being the same as the GRPC mode proposed here, so I think that would add complexity without actually providing any benefit.)
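As a runnable sketch of the buffering and deframing described here (assuming the standard 5-byte gRPC frame header; compression and error handling omitted), the data plane would hold bytes until a complete message is available and then forward only the deframed payload:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitMessages consumes as many complete gRPC frames as buf contains,
// returning the deframed message payloads and the unconsumed remainder
// (an incomplete trailing frame waits for more DATA frame bytes).
func splitMessages(buf []byte) (msgs [][]byte, rest []byte) {
	for len(buf) >= 5 {
		n := int(binary.BigEndian.Uint32(buf[1:5]))
		if len(buf)-5 < n {
			break // incomplete message; need more bytes
		}
		msgs = append(msgs, buf[5:5+n])
		buf = buf[5+n:]
	}
	return msgs, buf
}

func main() {
	// Two complete messages plus the start of a third, as they might
	// arrive split arbitrarily across HTTP/2 DATA frames.
	buf := []byte{
		0, 0, 0, 0, 2, 'h', 'i', // message 1: "hi"
		0, 0, 0, 0, 0, // message 2: empty
		0, 0, 0, 0, 9, 'p', 'a', // message 3: incomplete
	}
	msgs, rest := splitMessages(buf)
	fmt.Printf("complete: %q, leftover: %d bytes\n", msgs, len(rest))
}
```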
Contributor:
@markdroth can the gRPC client just re-use FULL_DUPLEX_STREAMED mode, then? The gRPC client implementation can do the buffering and deframing before sending the messages to the ext_proc server; that would be a gRPC client implementation detail.
Author:
No, my expectation is that Envoy would also implement GRPC mode. Users that want to use ext_proc for gRPC traffic would want to migrate to this mode if they need to use proxyless gRPC.

If the goal is to implement compression on a per-message basis, which is the way compression is normally done for gRPC, then it should be totally possible in ``GRPC`` mode.

If the goal is to implement compression for the entire HTTP/2 stream, even compressing the gRPC framing within the HTTP/2 DATA frames, then that's not something gRPC will be able to support in the first place, regardless of how we structure the configuration. In gRPC, we deal with the gRPC framing in our transport layer, and the xDS HTTP filters are above that, so the ext_proc filter will see only the deframed gRPC messages. It's not possible for the filter to replace the raw HTTP/2 DATA frames, since it simply doesn't operate at that layer.

Given gRPC's architecture, I don't think there's anything that the ext_proc server could do that we could actually support but that it won't be able to do with ``GRPC`` mode.

Author:
BTW, I will note that this is yet another reason why it makes sense for gRPC to not support ``FULL_DUPLEX_STREAMED``: that mode may fool ext_proc server authors into thinking that they can do things that gRPC simply can't support.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking more and looking at other ext_proc implementations I think it would be preferable to use a separate field to indicate content and leave body sending mode as is. Proxyless gRPC would then use content_type: GRPC and ...body_mode: FULL_DUPLEX_STREAMED. Proxyless gRPC does need to implement other modes like BUFFERED or STREAMED.
The reason is we want to add support for other content types such as JSON-RPC and we want to be able to send them in all modes, most importantly in BUFFERED mode, since some ext_proc servers only support this mode and it will delay adoption if we have to force implementations to use FULL_DUPLEX_STREAMED to be able to process other content types.
If we put the GRPC content as the send mode then we have to add content-type for JSON-RPC anyway and then have to explain that it works in all cases except GRPC mode, which is not great.
Using content-type; GRPC and ..._body_mode: FULL_DUPLEX_STREAMED will work exactly the same way as GRPC body send mode without limiting implementation from using other send modes either for GRPC or for other protocols we are adding.
Author:
I understand the desire to support other content-aware modes like JSON-RPC, and I agree that that should be possible. So let's talk about the best way to do that.

It's not clear to me that introducing a separate content type setting actually provides a better story for doing that, because it's not actually an independent dimension from the body send mode. The content type would actually directly affect the behavior of the body send mode -- e.g., the exact buffering behavior of FULL_DUPLEX_STREAMED will be different depending on whether the content type is GRPC -- which means that both code and humans will have to look at both settings in order to understand the behavior.

Given that these two things are really just a single dimension, I think it would make things much easier for the configuration to reflect that. I would suggest instead defining new body send modes for each actual body send behavior, including those that are content-aware -- e.g., renaming the proposed GRPC mode to FULL_DUPLEX_STREAMED_GRPC, and later adding content-aware variants of other modes as needed.

I realize that this approach will wind up with a larger number of body send modes, but I think it will ultimately make it easier to understand and implement.

To be clear, I think we are going to ultimately support the same set of modes either way; the question here is only about which configuration structure winds up being easier to understand and which adds more complexity. If there's a strong argument for doing this as a separate content type knob, we can make that work, but it seems worse to me in terms of complexity and understandability.

Thoughts...?

As an aside, it's still not clear to me that it makes sense to support BUFFERED mode for gRPC traffic: gRPC treats streaming as a first-class citizen, and buffering an entire stream of messages to hand to the ext_proc server as a single unit doesn't fit that model well.

That having been said, we don't need to decide this right now, as long as we leave room for the possibility in the future. I think that if we rename the GRPC mode to FULL_DUPLEX_STREAMED_GRPC, it will leave an easy way to add support for a BUFFERED_GRPC mode in the future if we decide to do that.
Contributor:
Why can't it be independent? The bytes that the ext_proc filter sends to the server are opaque to it; there is no specific meaning attached to them today. Today they are implied to be a portion of the HTTP body (not even corresponding to H/2 DATA or H/1 chunks) because no other protocols are used with ext_proc. This can be made explicit with an additional content-type field.

We'd like to make ext_proc and other filters work generically with the protocols encapsulated within HTTP. I think we can make this work by adding ambient frame information to the request that ext_proc can use to generically support sending and receiving of encapsulated protocols. We have at least 2 in the pipeline: GRPC and JSON-RPC/MCP. WebSocket is also being asked for, but this is less definitive, so we want to give developers a flexible way to add support for encapsulated protocols without having to also modify the ext_proc protocol.
We'd like to make ext_proc and other filters work generically with the protocols encapsulated within HTTP. I think we can make this work by adding ambient frame information to the request that ext_proc can use to generically support sending and receiving of encapsulated protocols. We have at least 2 in the pipeline: GRPC, JSON-RPC/MCP. WebSocket is also being asked for, but this is less definitive, so we want to give developers a flexible way to add support for encapsulated protocols without having to also modify ext_proc protocol.