Merged

30 commits
9cd388d
xDS: ext_proc: add GRPC processing mode, and clean up docs
markdroth Mar 13, 2025
c30e220
Merge remote-tracking branch 'upstream/main' into ext_proc_grpc
markdroth Jul 10, 2025
ed81c37
add missing import
markdroth Jul 11, 2025
b30a64b
add missing build dep
markdroth Jul 11, 2025
2800967
spelling
markdroth Jul 11, 2025
42bd600
fix formatting
markdroth Jul 11, 2025
532f526
attempt to fix doc generation
markdroth Jul 11, 2025
5166f35
Merge remote-tracking branch 'upstream/main' into ext_proc_grpc
markdroth Sep 10, 2025
ace4e47
attempt to fix RST formatting
markdroth Sep 10, 2025
8c307bc
fix wording
markdroth Sep 10, 2025
bc9a0ad
clarify streaming behavior of GRPC body send mode
markdroth Sep 12, 2025
975e849
add comment about timeout behavior vs. body mode, and send_body_witho…
markdroth Sep 12, 2025
e8afc4b
try to fix formatting
markdroth Sep 12, 2025
532c31d
GRPC mode will support timers, and note that FULL_DUPLEX_STREAMED doe…
markdroth Sep 12, 2025
cd3b71f
[xDS] improve documentation for ext_proc
markdroth Sep 12, 2025
4b7f6f5
clarify wording
markdroth Sep 12, 2025
e7cbd4d
formatting and wording tweak
markdroth Sep 12, 2025
aa8d3cc
GRPC mode uses StreamedBodyResponse
markdroth Sep 12, 2025
6932eb6
clarify GRPC mode description, and note that message timeouts are not…
markdroth Sep 12, 2025
0015fbd
ext_proc server can inject EOS on gRPC client message
markdroth Sep 12, 2025
708120f
add a safe way for ext_proc server to close the stream with OK status
markdroth Sep 12, 2025
8b95c52
Update api/envoy/extensions/filters/http/ext_proc/v3/processing_mode.…
markdroth Sep 16, 2025
9186b12
clarify wording
markdroth Sep 16, 2025
51224c4
Merge branch 'ext_proc_doc_update' into ext_proc_grpc
markdroth Sep 16, 2025
d2334a8
add end_of_stream_without_message field
markdroth Sep 16, 2025
ced8f28
add end_of_stream_without_message in response also
markdroth Sep 16, 2025
28c5bd6
Merge remote-tracking branch 'upstream/main' into ext_proc_grpc
markdroth Sep 18, 2025
af6aa49
note that GRPC mode is similar to FULL_DUPLEX_STREAMED
markdroth Oct 2, 2025
ec538a3
fix space
markdroth Oct 2, 2025
4fa776a
add a bit to indicate whether gRPC messages are compressed
markdroth Oct 7, 2025
16 changes: 9 additions & 7 deletions api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto
@@ -59,8 +59,10 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// * Request body: If the body is present, the behavior depends on the
// body send mode. In ``BUFFERED`` or ``BUFFERED_PARTIAL`` mode, the body is sent to the external
// processor in a single message. In ``STREAMED`` or ``FULL_DUPLEX_STREAMED`` mode, the body will
-// be split across multiple messages sent to the external processor. In ``NONE`` mode, the body
-// will not be sent to the external processor.
+// be split across multiple messages sent to the external processor. In ``GRPC`` mode, as each
+// gRPC message arrives, it will be sent to the external processor (there will be exactly one
+// gRPC message in each message sent to the external processor). In ``NONE`` mode, the body will
+// not be sent to the external processor.
// * Request trailers: Delivered if they are present and if the trailer mode is set
// to ``SEND``.
// * Response headers: Contains the headers from the HTTP response. Keep in mind
@@ -208,9 +210,9 @@ message ExternalProcessor {
// an error (subject to the processing mode) if the timer expires before a
// matching response is received. There is no timeout when the filter is
// running in observability mode or when the body send mode is
-// ``FULL_DUPLEX_STREAMED``. Zero is a valid config which means the timer
-// will be triggered immediately. If not configured, default is 200
-// milliseconds.
+// ``FULL_DUPLEX_STREAMED`` or ``GRPC``. Zero is a valid config which means
+// the timer will be triggered immediately. If not configured, default is
+// 200 milliseconds.
google.protobuf.Duration message_timeout = 7 [(validate.rules).duration = {
lte {seconds: 3600}
gte {}
@@ -272,8 +274,8 @@ message ExternalProcessor {
// without pausing on filter chain iteration. It is "Send and Go" mode that can be used
// by external processor to observe the request's data and status. In this mode:
//
-// 1. Only ``STREAMED`` and ``NONE`` body processing modes are supported; for any other body
-// processing mode, the body will not be sent.
+// 1. Only ``STREAMED``, ``GRPC``, and ``NONE`` body processing modes are supported; for any
+// other body processing mode, the body will not be sent.
//
// 2. External processor should not send back processing response, as any responses will be ignored.
// This also means that
api/envoy/extensions/filters/http/ext_proc/v3/processing_mode.proto
@@ -108,6 +108,25 @@ message ProcessingMode {
// * The client will stream the body chunks in the responses from the server to the upstream/downstream as they arrive.

FULL_DUPLEX_STREAMED = 4;

// [#not-implemented-hide:]
// A mode for gRPC traffic. This is similar to ``FULL_DUPLEX_STREAMED``,
// except that instead of sending raw chunks of the HTTP/2 DATA frames,
// the ext_proc client will de-frame the individual gRPC messages inside
// the HTTP/2 DATA frames, and as each message is de-framed, it will be
// sent to the ext_proc server as a :ref:`request_body
// <envoy_v3_api_field_service.ext_proc.v3.ProcessingRequest.request_body>`
// or :ref:`response_body
// <envoy_v3_api_field_service.ext_proc.v3.ProcessingRequest.response_body>`.
// The ext_proc server will stream back individual gRPC messages in the
// :ref:`StreamedBodyResponse <envoy_v3_api_msg_service.ext_proc.v3.StreamedBodyResponse>`
// field, but the number of messages sent by the ext_proc server
// does not need to equal the number of messages sent by the data
// plane. This allows the ext_proc server to change the number of
// messages sent on the stream.
// In this mode, the client will send body and trailers to the server as
// they arrive.
GRPC = 5;
Contributor

This protocol mode does not seem to define a new behavior but rather a semantic interpretation of the content of the body bytes. An ext_proc client that sends a body containing a complete gRPC message using the GRPC mode or any other mode is indistinguishable to the ext_proc server - i.e. they will all be processed the same way.

In the end I think this change complicates the already complicated and confusing protocol even more without offering a tangible benefit. Moreover, server implementations still have to support gRPC messages split over multiple ext_proc messages, since this is an existing behavior.

Contributor Author (@markdroth) Sep 23, 2025

This protocol mode does not seem to define a new behavior but rather a semantic interpretation of the content of the body bytes. An ext_proc client that sends a body containing a complete gRPC message using the GRPC mode or any other mode is indistinguishable to the ext_proc server - i.e. they will all be processed the same way.

That's not actually true -- the contents sent to the ext_proc server are actually different under this mode, so the ext_proc server can absolutely tell the difference.

For gRPC traffic, the content of the HTTP/2 DATA frames is a sequence of framed gRPC messages. However, there is no guarantee that each DATA frame contains exactly one framed gRPC message; it's entirely possible to have a single framed gRPC message span multiple DATA frames, and it's also entirely possible to have multiple framed gRPC messages within a single DATA frame. The goal of the new GRPC body send mode is to handle the buffering and deframing in the data plane instead of making every ext_proc server handle it itself.

In an existing body send mode like FULL_DUPLEX_STREAMED, every ext_proc server will need to handle buffering and deframing of gRPC messages as it receives DATA frame chunks. In the new GRPC mode, the ext_proc server sees only the deframed gRPC messages, and it always gets exactly one deframed gRPC message in each body message on the ext_proc stream.

In addition, note that gRPC cannot implement any other mode here, because our filters do not have access to the raw HTTP/2 DATA frames. In gRPC, we handle deframing the gRPC messages in our transport layer (i.e., the same place where we handle the HTTP/2 protocol), and our filters see only the deframed gRPC messages. The only way we could implement the current FULL_DUPLEX_STREAMED protocol would be some horrible hack where we'd need to re-add the HTTP/2 framing to each message before we send it to the ext_proc server, which really doesn't make sense -- and it would probably hurt performance by requiring additional memory allocations. And since we want to support the ability for users to switch between proxy-based setups and proxyless gRPC, we need some common protocol that both data planes can support for gRPC traffic.
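To make the deframing burden concrete, here is a minimal Python sketch of the buffering and deframing that every ext_proc server must do today under FULL_DUPLEX_STREAMED for gRPC traffic, and that GRPC mode would move into the data plane. It uses the standard gRPC message framing (a 1-byte compressed flag followed by a 4-byte big-endian length); the helper name and usage are illustrative, not part of any ext_proc API.

import struct

def deframe(buffer: bytearray):
    """Yield complete gRPC messages from buffered DATA-frame bytes.

    Each gRPC message is framed as a 1-byte compressed flag plus a
    4-byte big-endian length, then the payload. DATA frame boundaries
    need not align with message boundaries, so the caller appends each
    chunk to ``buffer`` and drains whatever complete messages exist.
    """
    while len(buffer) >= 5:
        compressed = buffer[0] == 1
        length = struct.unpack(">I", buffer[1:5])[0]
        if len(buffer) < 5 + length:
            break  # Message continues in a future DATA frame; keep buffering.
        message = bytes(buffer[5:5 + length])
        del buffer[:5 + length]
        yield compressed, message

# One DATA frame may carry several messages, and one message may span frames:
buf = bytearray(b"\x00\x00\x00\x00\x02hi" + b"\x00\x00\x00\x00\x03")
assert list(deframe(buf)) == [(False, b"hi")]  # second message still incomplete
buf += b"abc"                                  # next DATA frame completes it
assert list(deframe(buf)) == [(False, b"abc")]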

So basically, there are two arguments for this new mode:

  1. We want to make things easier by having the gRPC buffering and framing handled in the data plane instead of in each individual ext_proc server.
  2. gRPC cannot support any other body send mode, and we want a common mode that works with both data planes for gRPC traffic.

server implementations still have to support gRPC messages split over multiple ext_proc messages since this is an existing behavior.

I think there's a clear path to removing the need for that in existing servers:

  1. Add support for GRPC mode in the ext_proc server. (Note that the ext_proc request already tells the server what body send mode the client is using, so during this migration, the server can tell how it's supposed to handle each ext_proc stream; a sketch of this dispatch follows the list.)
  2. Change the configuration of all data planes to use GRPC mode.
  3. Once all data planes have been reconfigured, it will be safe to remove support for other body send modes from the ext_proc server.
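A sketch of the per-stream dispatch from step 1, assuming the server can read the client's body send mode from the ProcessingRequest; the function and handler names below are hypothetical, not ext_proc API names.

def on_body_chunk(mode: str, chunk: bytes, buffer: bytearray, handle_message) -> None:
    # Hypothetical migration-time dispatch in an ext_proc server.
    if mode == "GRPC":
        handle_message(chunk)  # already exactly one deframed gRPC message
        return
    # Legacy modes (e.g. FULL_DUPLEX_STREAMED): buffer and deframe in the
    # server, e.g. with the deframe() sketch above.
    buffer += chunk
    for _compressed, message in deframe(buffer):
        handle_message(message)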

Contributor

The BodySendMode describes the behavior of the state machine for exchanging body content with the server. You are describing two things simultaneously with this new value: what the body payload contains and (possibly) new state transitions. The latter part is not described here. Does this value imply that the state machine is the same as STREAMED but with a payload of gRPC messages? Or is it the same as FULL_DUPLEX_STREAMED? Or something new entirely? And why can't I use the BUFFERED state machine if I know I will only have unary gRPC messages?

Contributor Author

The state machine is essentially the same as FULL_DUPLEX_STREAMED, just with the addition of the buffering and deframing of gRPC messages. In other words, the data plane will buffer only until it sees a complete gRPC message, at which point it will deframe the message and stream it to the ext_proc server. And the body responses sent back from the ext_proc server will each be an individual gRPC message as well.

I don't think it makes sense to use BUFFERED mode here. Note that the gRPC wire protocol is designed to handle streaming as a first-class citizen, and unary RPCs are just a special case: a unary RPC is a bidi stream that just happens to have a single request followed by a single result. Inside the HTTP/2 DATA frames, the request and response messages are still encoded with the same gRPC framing as in streaming cases. So using BUFFERED mode would still require the data plane to send the raw contents of the HTTP/2 DATA frames, and it would require the ext_proc server to handle the deframing, which is what we're trying to avoid here. (I don't think we want to add a BUFFERED_GRPC mode to make it handle the framing, and even if we did, the behavior would wind up being the same as the GRPC mode proposed here, so I think that would add complexity without actually providing any benefit.)

Contributor

@markdroth can the gRPC client just re-use FULL_DUPLEX_STREAMED mode then? The gRPC client implementation can do the buffering and deframing before sending the message to the ext_proc server, as a gRPC client implementation detail.

Contributor Author

Based on the above comment, the proposal is that the only body send mode supported by proxyless gRPC is GRPC? If so, I take it that the expectation is that Envoy would not have to implement GRPC at all, right?

No, my expectation is that Envoy would also implement GRPC mode. Users that want to use ext_proc for gRPC traffic would want to migrate to this mode if they need to use proxyless gRPC.

I think it is a net loss to not have FULL_DUPLEX_STREAMED mode for proxyless gRPC. Part of the appeal of ext_proc is that extension authors can do things which aren't supported by the Envoy/proxyless gRPC stack and/or are too expensive or risky to do in either place. We recently had this discussion for compression and decompression, because the CPU cost of doing it in the proxy was a significant driver to moving it to ext_proc — at least then, it could be paid for by the ext_proc server owner rather than the proxy/client.

Take for example an ext_proc server which implements a new form of compression, let's call it Zsupreme. For cost reasons or whatever, the customer deploys an ext_proc server on GCE that sets the appropriate headers and performs Zsupreme compression on the body bytes for the proxy — AIUI, this is not possible using GRPC send mode.

If the goal is to implement compression on a per-message basis, which is the way compression is normally done for gRPC, then it should be totally possible in GRPC mode.

If the goal is to implement compression for the entire HTTP/2 stream, even compressing the gRPC framing within the HTTP/2 DATA frames, then that's not something gRPC will be able to support in the first place, regardless of how we structure the configuration. In gRPC, we deal with the gRPC framing in our transport layer, and the xDS HTTP filters are above that, so the ext_proc filter will see only the deframed gRPC messages. It's not possible for the filter to replace the raw HTTP/2 DATA frames, since it simply doesn't operate at that layer.

Given gRPC's architecture, I don't think there's anything the ext_proc server could do that we could actually support that it won't be able to do with GRPC mode.
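As an illustration of the per-message compression case, here is a sketch of what an ext_proc server could do in GRPC mode, using the StreamedBodyResponse fields from this PR. The dataclass is a stand-in for the generated proto, and negotiating the grpc-encoding header with the peer is assumed to be handled separately.

import gzip
from dataclasses import dataclass

@dataclass
class StreamedBodyResponse:  # stand-in mirroring the proto fields in this PR
    body: bytes
    end_of_stream: bool = False
    end_of_stream_without_message: bool = False
    grpc_message_compressed: bool = False

def compress_message(body: bytes, end_of_stream: bool) -> StreamedBodyResponse:
    # In GRPC mode each body message is exactly one deframed gRPC message,
    # so compression can be applied per message.
    return StreamedBodyResponse(
        body=gzip.compress(body),
        end_of_stream=end_of_stream,
        grpc_message_compressed=True,
    )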

Contributor Author

BTW, I will note that this is yet another reason why it makes sense for gRPC to not support FULL_DUPLEX_STREAMED, because that may fool ext_proc server authors into thinking that they can do things that gRPC simply can't support.

Contributor

After thinking more and looking at other ext_proc implementations, I think it would be preferable to use a separate field to indicate content and leave the body send mode as is. Proxyless gRPC would then use content_type: GRPC and ...body_mode: FULL_DUPLEX_STREAMED. Proxyless gRPC does not need to implement other modes like BUFFERED or STREAMED.

The reason is we want to add support for other content types such as JSON-RPC and we want to be able to send them in all modes, most importantly in BUFFERED mode, since some ext_proc servers only support this mode and it will delay adoption if we have to force implementations to use FULL_DUPLEX_STREAMED to be able to process other content types.

If we put the GRPC content as the send mode, then we have to add a content type for JSON-RPC anyway and then have to explain that it works in all cases except GRPC mode, which is not great.

Using content_type: GRPC and ..._body_mode: FULL_DUPLEX_STREAMED will work exactly the same way as a GRPC body send mode, without preventing implementations from using other send modes either for GRPC or for the other protocols we are adding.

Contributor Author

I understand the desire to support other content-aware modes like JSON-RPC, and I agree that that should be possible. So let's talk about the best way to do that.

It's not clear to me that introducing a separate content type setting actually provides a better story for doing that, because it's not actually an independent dimension from the body send mode. The content type would actually directly affect the behavior of the body send mode -- e.g., the exact buffering behavior of FULL_DUPLEX_STREAMED will be different depending on whether the content type is GRPC -- which means that both code and humans will have to look at both settings in order to understand the behavior.

Specifically:

  • In terms of the data plane implementation, the code implementing the body send mode in the data plane will need to understand the content type and alter its behavior accordingly, so I don't think this change actually makes anything simpler there: the content type setting will not reduce implementation complexity in Envoy.
  • The presence of the content type setting will make ext_proc server implementations more complex, because any ext_proc server will need to look at two different settings in order to decide how to interpret the body chunks that it's receiving and know what body chunks it needs to send back.
  • There are certain combinations of body send mode and content type that are inherently nonsensical and would not be supported (e.g., STREAMED and GRPC). If they are two separate knobs, then every data plane and every ext_proc server will need to decide how to handle these unsupported combinations.
  • For humans, it will be harder to understand the behavior if the behavior is dictated by two different settings that interact with each other so deeply rather than just one. We will need to document the behavior for every possible combination of body send mode and content type: each body send mode will need to document which content types it supports and how that content type affects the state machine, and each content type will need to document which body send modes it supports. For example, we'll need to document things like "GRPC content_type is supported only in FULL_DUPLEX_STREAMED body send mode" and "in FULL_DUPLEX_STREAMED body send mode, if the content type is GRPC, each body chunk sent to or from the ext_proc server is a single deframed gRPC message". In contrast, if we have just one setting for body send mode, then it's much easier for users to understand the behavior of that mode.

Given that these two things are really just a single dimension, I think it would make things much easier for the configuration to reflect that. I would suggest instead defining new body send modes for each actual body send behavior, including those that are content-aware. In other words:

  • We can rename GRPC body send mode to FULL_DUPLEX_STREAMED_GRPC.
  • In the future, we can add BUFFERED_JSON_RPC and FULL_DUPLEX_STREAMED_JSON_RPC if needed.

I realize that this approach will wind up with a larger number of body send modes, but I think it will ultimately make it easier to understand and implement.

To be clear, I think we are going to ultimately support the same set of modes either way; the question here is only about which configuration structure winds up being easier to understand and which adds more complexity. If there's a strong argument for doing this as a separate content type knob, we can make that work, but it seems worse to me in terms of complexity and understandability.

Thoughts...?


As an aside, it's still not clear to me that it makes sense to support BUFFERED mode for gRPC traffic, for the following reasons:

  • This mode is useful only for unary RPCs, where there is only a single gRPC message on the stream. However, there is no indication in the wire protocol whether the RPC is unary or streaming, so the data plane would not even know that the RPC is streaming until after it sends the initial gRPC message to the ext_proc server. And it's not clear how the data plane should react at that point.
  • This mode does not send the body to the ext_proc server until after it receives the ext_proc response for the request headers, which means that it adds significant latency. This is something gRPC works hard to avoid, especially for unary RPCs: we send the client headers, message, and a half close in a single write on the TCP connection for performance reasons.

That having been said, we don't need to decide this right now, as long as we leave room for the possibility in the future. I think that if we rename the GRPC mode to FULL_DUPLEX_STREAMED_GRPC, it will leave an easy way to add support for a BUFFERED_GRPC mode in the future if we decide to do that.

Contributor

It's not clear to me that introducing a separate content type setting actually provides a better story for doing that, because it's not actually an independent dimension from the body send mode.

Why can't it be independent? The bytes that the ext_proc filter sends to the server are opaque to it; there is no specific meaning attached to them today. Today they are implied to be a portion of the HTTP body (not even corresponding to H/2 DATA or H/1 chunks) because no other protocols are used with ext_proc. This can be made explicit with an additional content-type field.

We'd like to make ext_proc and other filters work generically with the protocols encapsulated within HTTP. I think we can make this work by adding ambient frame information to the request that ext_proc can use to generically support sending and receiving of encapsulated protocols. We have at least two in the pipeline: GRPC and JSON-RPC/MCP. WebSocket is also being asked for, but this is less definitive, so we want to give developers a flexible way to add support for encapsulated protocols without having to also modify the ext_proc protocol.

}

// How to handle the request header. Default is "SEND".
62 changes: 56 additions & 6 deletions api/envoy/service/ext_proc/v3/external_processor.proto
@@ -164,7 +164,7 @@ message ProcessingRequest {
// the server must send back exactly one ProcessingResponse message.
// * If it is set to ``FULL_DUPLEX_STREAMED``, the server must follow the API defined
// for this mode to send the ProcessingResponse messages.
-// [#next-free-field: 11]
+// [#next-free-field: 12]
message ProcessingResponse {
// The response type that is sent by the server.
oneof response {
@@ -224,6 +224,19 @@ message ProcessingResponse {
// is set to true.
envoy.extensions.filters.http.ext_proc.v3.ProcessingMode mode_override = 9;

// [#not-implemented-hide:]
Contributor

The Envoy data plane has this planned and is still working out the details. The behavior might be slightly different from the gRPC client's.

Contributor Author

Good to know. I think we should handle this the same way for both FULL_DUPLEX_STREAMED and GRPC modes, so if you have a slightly different approach in mind, I'm happy to discuss changes here.

// Used only in ``FULL_DUPLEX_STREAMED`` and ``GRPC`` body send modes.
// Instructs the data plane to stop sending body data and to send a
// half-close on the ext_proc stream. The ext_proc server should then echo
// back all subsequent body contents as-is until it sees the client's
// half-close, at which point the ext_proc server can terminate the stream
// with an OK status. This provides a safe way for the ext_proc server
// to indicate that it does not need to see the rest of the stream;
// without this, the ext_proc server could not terminate the stream
// early, because it would wind up dropping any body contents that the
// client had already sent before it saw the ext_proc stream termination.
bool request_drain = 11;

// When ext_proc server receives a request message, in case it needs more
// time to process the message, it sends back a ProcessingResponse message
// with a new timeout value. When the data plane receives this response
@@ -268,11 +281,27 @@ message HttpHeaders {
message HttpBody {
Contributor (@leonm1) Oct 3, 2025

This message is named HttpBody, but it does not contain any bytes or metadata related to HTTP in GRPC mode. It seems as though we should instead have a message like the following:

message GrpcMessage {
  google.protobuf.Any payload = 1;

  // End-of-stream booleans
}

This would also avoid the pitfall of a GRPC-specific boolean in the HttpBody message.

Contributor Author

Even in the new GRPC body send mode, the messages will still be sent as serialized bytes. They can't be sent as Any fields, because that requires encoding the protobuf type, and the ext_proc filter does not know the protobuf type to populate that. (This is true in both Envoy and gRPC.)

As I mentioned in another comment thread, I agree that the end_of_stream_without_message field is a little ugly, but I think the ideal solution would have been to simply use a wrapper type for the body field so that we could tell the difference between empty and unset. I don't think the two cases are really different enough to warrant a different message.

More generally, I do agree that there's a lot of tech debt in this API, and I wish it had been thought out better from the get-go rather than just evolving in place over time. But I don't think that's something we can fix at this point without giving up and creating a whole new API, which would fracture the ecosystem.

Contributor (@paul-r-gall) Oct 7, 2025

Adding an Any means that somewhere the Envoy ext_proc filter's gRPC parsing code needs to be able to build the proto in question, which can only happen if that proto definition is linked into Envoy. Rather than google.protobuf.Any, it should probably just be bytes serialized_message = 1.

Contributor (@paul-r-gall) Oct 7, 2025

Ah, sorry, I responded to Matt without seeing Mark's response. I basically agree with Mark's comment.

// The contents of the body in the HTTP request/response. Note that in
// streaming mode multiple ``HttpBody`` messages may be sent.
//
// In ``GRPC`` body send mode, a separate ``HttpBody`` message will be
// sent for each message in the gRPC stream.
bytes body = 1;

// If ``true``, this will be the last ``HttpBody`` message that will be sent and no
// trailers will be sent for the current request/response.
bool end_of_stream = 2;

// This field is used in ``GRPC`` body send mode when ``end_of_stream`` is
// true and ``body`` is empty. Those values would normally indicate an
// empty message on the stream with the end-of-stream bit set.
// However, if the half-close happens after the last message on the
// stream was already sent, then this field will be true to indicate an
// end-of-stream with *no* message (as opposed to an empty message).
bool end_of_stream_without_message = 3;
Contributor (@yanjunxiang-google) Sep 29, 2025

This makes the protocol very complicated, with Envoy and gRPC doing different things. Can we have a separate API for gRPC?

Contributor Author

All this is really doing is indicating the difference between unset and empty for the body field. I would have preferred to change the type of the body field to use a wrapper message similar to google.protobuf.StringValue, but that would be a breaking change.

I don't see why this field specifically adds a lot of complexity. I agree that there is a bit of work that needs to happen to support the GRPC body send mode in general, but this field specifically doesn't add very much complexity on top of that at all.

Note that there is no proposal here for Envoy and gRPC to do different things. This PR is about how to handle gRPC traffic, regardless of which data plane is handling that traffic. We would want Envoy to support this body send mode as well, since Envoy does often handle gRPC traffic.

I think that inventing a completely new ext_proc API for gRPC traffic doesn't make sense. The goal here is to leverage the existing ext_proc API, not build a competitor to it.

Contributor

In GRPC mode, is an empty body with end_of_stream == true distinct from an empty body with end_of_stream_without_message == true? If so, how?

Contributor Author

In gRPC, the client may send a half-close without sending a message, or long after the last message was sent. On the wire, that's represented as an HTTP/2 DATA frame that is empty but has the EOS bit set. But note that an empty HTTP/2 DATA frame is not the same as sending an empty gRPC message -- if another message was sent, even if that message was empty, it would still be framed within the HTTP/2 DATA frame.

In other words, there are 3 possible cases for a DATA frame with the EOS bit set:

  1. DATA frame with size 0 and EOS. This means that the client sent a half-close without sending a message.
  2. DATA frame with size 5 and EOS, where the contents are a gRPC message frame indicating a 0-length message. This means that the client has sent an empty message and EOS.
  3. DATA frame with size 15 and EOS, where the contents are a gRPC message frame indicating a 10-byte message. This means that the client has sent a non-empty message and EOS.

Note that on the wire between gRPC client and server, the difference between cases 1 and 2 is whether the gRPC framing is present. However, in GRPC send mode, we are not including the gRPC framing when we send the message to the ext_proc server; we're sending only the deframed messages. This means that we need a way to differentiate between cases 1 and 2, which is what this field provides.
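Concretely, here are the three cases as byte sequences, using the standard gRPC framing of a 1-byte compressed flag plus a 4-byte big-endian length (a sketch; the variable names are illustrative):

case1 = b""                                  # half-close with no message
case2 = b"\x00\x00\x00\x00\x00"              # framed empty message, plus EOS
case3 = b"\x00\x00\x00\x00\x0a" + b"x" * 10  # framed 10-byte message, plus EOS

# In GRPC send mode the framing is stripped before the ext_proc server sees
# the body, so cases 1 and 2 would both arrive as body=b"" with
# end_of_stream=true; end_of_stream_without_message=true distinguishes case 1.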


// This field is used in ``GRPC`` body send mode to indicate whether
// the message is compressed. This will never be set to true by gRPC
// but may be set to true by a proxy like Envoy.
bool grpc_message_compressed = 4;
}

// This message is sent to the external server when the HTTP request and
@@ -331,6 +360,8 @@ message CommonResponse {
//
// In other words, this response makes it possible to turn an HTTP GET
// into a POST, PUT, or PATCH.
//
// Not supported if the body send mode is ``GRPC``.
CONTINUE_AND_REPLACE = 1;
}

@@ -415,15 +446,34 @@ message HeaderMutation {
repeated string remove_headers = 2;
}

-// The body response message corresponding to FULL_DUPLEX_STREAMED body mode.
+// The body response message corresponding to ``FULL_DUPLEX_STREAMED`` or ``GRPC`` body modes.
Contributor

Each field in this message now has a comment above it to indicate that it means semantically different things based on a field contained in a different message.

This is saying to me that it belongs as a different message in a oneof at a higher level.

Contributor Author

I don't think they're really all that semantically different. The body field still contains the relevant part of the body, and the end_of_stream field always indicates EOS.

If I created another message here, it would look almost identical to this one, and I think that would cause more confusion than it would provide benefit at this point.

message StreamedBodyResponse {
-// The body response chunk that will be passed to the upstream/downstream by the data plane.
+// In ``FULL_DUPLEX_STREAMED`` body send mode, contains the body response chunk that will be
+// passed to the upstream/downstream by the data plane. In ``GRPC`` body send mode, contains
+// a serialized gRPC message to be passed to the upstream/downstream by the data plane.
bytes body = 1;

// The server sets this flag to true if it has received a body request with
// :ref:`end_of_stream <envoy_v3_api_field_service.ext_proc.v3.HttpBody.end_of_stream>` set to true,
// and this is the last chunk of body responses.
// Note that in ``GRPC`` body send mode, this allows the ext_proc
// server to tell the data plane to send a half close after a client
// message, which will result in discarding any other messages sent by
// the client application.
bool end_of_stream = 2;

// This field is used in ``GRPC`` body send mode when ``end_of_stream`` is
// true and ``body`` is empty. Those values would normally indicate an
// empty message on the stream with the end-of-stream bit set.
// However, if the half-close happens after the last message on the
// stream was already sent, then this field will be true to indicate an
// end-of-stream with *no* message (as opposed to an empty message).
bool end_of_stream_without_message = 3;

// This field is used in ``GRPC`` body send mode to indicate whether
// the message is compressed. This will never be set to true by gRPC
// but may be set to true by a proxy like Envoy.
bool grpc_message_compressed = 4;
}

// This message specifies the body mutation the server sends to the data plane.
@@ -433,19 +483,19 @@ message BodyMutation {
// The entire body to replace.
// Should only be used when the corresponding ``BodySendMode`` in the
// :ref:`processing_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.processing_mode>`
-// is not set to ``FULL_DUPLEX_STREAMED``.
+// is not set to ``FULL_DUPLEX_STREAMED`` or ``GRPC``.
bytes body = 1;

// Clear the corresponding body chunk.
// Should only be used when the corresponding ``BodySendMode`` in the
// :ref:`processing_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.processing_mode>`
-// is not set to ``FULL_DUPLEX_STREAMED``.
+// is not set to ``FULL_DUPLEX_STREAMED`` or ``GRPC``.
// Clear the corresponding body chunk.
bool clear_body = 2;

// Must be used when the corresponding ``BodySendMode`` in the
// :ref:`processing_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.processing_mode>`
-// is set to ``FULL_DUPLEX_STREAMED``.
+// is set to ``FULL_DUPLEX_STREAMED`` or ``GRPC``.
StreamedBodyResponse streamed_response = 3
[(xds.annotations.v3.field_status).work_in_progress = true];
}