-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
quiche: enable downstream HTTP3 in quic_protocol_integration_test #15424
quiche: enable downstream HTTP3 in quic_protocol_integration_test #15424
Conversation
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! So good to get more coverage for HTTP/3!
@@ -188,6 +193,10 @@ name: health_check | |||
|
|||
// Verifies behavior for https://github.com/envoyproxy/envoy/pull/11248 | |||
TEST_P(ProtocolIntegrationTest, AddBodyToRequestAndWaitForIt) { | |||
// QUICHE can't guarantee headers and FIN to be delivered together, so | |||
// headers-only request can't be detected at L7 filters. | |||
EXCLUDE_DOWNSTREAM_HTTP3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine to leave this disabled for now but I think we can adapt the test filter to change a request with no body (headers, fin) to do (headers, fin, body). I've adapted a few filters in my test PR so might as well wait for that to land.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -425,13 +442,13 @@ TEST_P(DownstreamProtocolIntegrationTest, MissingHeadersLocalReply) { | |||
|
|||
// Regression test for https://github.com/envoyproxy/envoy/issues/10270 | |||
TEST_P(ProtocolIntegrationTest, LongHeaderValueWithSpaces) { | |||
EXCLUDE_DOWNSTREAM_HTTP3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of the header tests should be fixed by 15381 but others need configurable header size to pass. We'll have to figure that one out at some point :-)
case Http::CodecClient::Type::HTTP2: | ||
stat_name = "http2.dropped_headers_with_underscores"; | ||
break; | ||
case Http::CodecClient::Type::HTTP3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO to add a stat for H3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
The TSAN keeps failing at integration_admin_test: https://dev.azure.com/cncf/envoy/_build/results?buildId=69583&view=logs&j=d1f76054-8f79-554b-6f4a-11d6a63b8e00&t=266e17e3-d213-54b5-deef-0dcee01da137&l=15243. Seems unrelated? |
Signed-off-by: Dan Zhang <danzh@google.com>
@@ -61,7 +61,10 @@ std::unique_ptr<T> spdyHeaderBlockToEnvoyHeaders(const spdy::SpdyHeaderBlock& he | |||
for (auto entry : header_block) { | |||
// TODO(danzh): Avoid temporary strings and addCopy() with string_view. | |||
std::string key(entry.first); | |||
headers->addCopy(Http::LowerCaseString(key), entry.second); | |||
std::vector<absl::string_view> values = absl::StrSplit(entry.second, '\0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised to see splitting on \0
and so I went looking around and found the definition in spdy/core/spdy_header_block.h which confirms it, but not all the time:
// If a header with the key is already present, then append the value to the
// existing header value, NUL ("\0") separated unless the key is cookie, in
// which case the separator is "; ".
// If there is no such key, a new header with the key and value is added.
void AppendValueOrAddHeader(const absl::string_view key,
const absl::string_view value);
I've never worked with that library before but I wonder if it supplies some sort of split API we should be using rather than calling StrSplit directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Josh, is your concern that the header may be split by \0 or ;? I think the base Enovy classes can handle ; (e.g. parseCookieValue) but would basically fail to recognize quic-multi-valued-headers with \0 because HTTP/2 and below don't support that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mostly just thinking that the \0 delim looks like an implementation detail of spdy_header_block.h and the detail should ideally be abstracted away at that level.
And yes I was also wondering specifically about what the behavior would be with cookies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, SpdyHeaderBlock doesn't have such API to split the \0
delimited header values. Even quic qpack code has to write its own logic to handle this special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments here might help at least clear up the mysticism, if it's really impossible to create an abstraction boundary around this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for more comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
for (auto& listener : *bootstrap_.mutable_static_resources()->mutable_listeners()) { | ||
if (listener.udp_listener_config().listener_config().typed_config().type_url() == | ||
"type.googleapis.com/envoy.config.listener.v3.QuicProtocolOptions") { | ||
auto* filter_chain = listener.mutable_filter_chains(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we be checking that filter_chains is non-empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a test util function whose input is the config template defined in the same file. But I added ASSERT to help future debugging.
ConnectionCallbacks(Event::Dispatcher& dispatcher) : dispatcher_(dispatcher) {} | ||
|
||
// Network::ConnectionCallbacks | ||
void onEvent(Network::ConnectionEvent event) override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE questions asked over DM about integration_admin_test, skimming over your PR it looks like this code might be interesting for some breakpoints.
I don't have a clear picture of why it's needed, so maybe some class comments would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment
test/integration/utility.cc
Outdated
NiceMock<Stats::MockIsolatedStatsStore> mock_stats_store; | ||
NiceMock<Random::MockRandomGenerator> random; | ||
Event::GlobalTimeSystem time_system; | ||
NiceMock<Random::MockRandomGenerator> random_generator; | ||
Api::Impl api(Thread::threadFactoryForTest(), mock_stats_store, time_system, | ||
Filesystem::fileSystemForTest(), random_generator); | ||
Event::DispatcherPtr dispatcher(api.allocateDispatcher("test_thread")); | ||
ConnectionCallbacks connection_callbacks(*dispatcher); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function has gotten a lot longer and harder to follow due to the handling of h3 in it. Can we think of ways to refactor this so that the function structure remains more like it used to be?
E.g. one option would be to just add an early dispatch:
if (h3 mode) {
return makeSingleH3Request(...);
}
and then factor out the common blocks from the h3 and non-h3 impls into some helpers.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for suggesting the refactoring! Surprisingly, this fix the TSAN issue I asked earlier. I found that createQuicUpstreamTransportSocketFactory() which was also called under h2 and h1 earlier somehow breaks the stats histogram Expectation. But I still don't understand how could a client util function lead to server stats change. This function calls a registered factory to create a quic upstream transport socket factory for the test client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the reason is that went looking at the code in its previous state, it was very heard to discern whether the behavior was the same in H2 with/without your change. I suspect it wasn't, and that was what was affecting the integration test tsan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still interested in why calling into a registered factory on client side could change server stats. This helper function is pretty much self contained. Do you have any idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd guess this is inducing the client to change the way it communicates with the server in a stat-observable way, but I'd have to dig in with the debugger to learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reproduced the same failure by adding a line to print trace in a clean branch https://github.com/envoyproxy/envoy/compare/main...danzh2010:admin_test?expand=1. Is there some existing race in that test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Would a sleep there also cause the test to fail? How about printing a bunch of text to stderr?
That trace macro doesn't look like it ought to wake up anything in stats to change behavior, so I suspect a sleep or simple I/O might do the same thing.
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@@ -61,7 +61,10 @@ std::unique_ptr<T> spdyHeaderBlockToEnvoyHeaders(const spdy::SpdyHeaderBlock& he | |||
for (auto entry : header_block) { | |||
// TODO(danzh): Avoid temporary strings and addCopy() with string_view. | |||
std::string key(entry.first); | |||
headers->addCopy(Http::LowerCaseString(key), entry.second); | |||
std::vector<absl::string_view> values = absl::StrSplit(entry.second, '\0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Josh, is your concern that the header may be split by \0 or ;? I think the base Enovy classes can handle ; (e.g. parseCookieValue) but would basically fail to recognize quic-multi-valued-headers with \0 because HTTP/2 and below don't support that.
auto envoy_headers = spdyHeaderBlockToEnvoyHeaders<Http::RequestHeaderMapImpl>(headers_block); | ||
EXPECT_EQ(headers_block.size(), envoy_headers->size()); | ||
EXPECT_EQ(headers_block.size() + 1u, envoy_headers->size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment why the +1 please, since it's not obvious about the \0 split
host_description, downstream_protocol_); | ||
if (downstream_protocol_ == Http::CodecClient::Type::HTTP3 && codec->disconnected()) { | ||
// Connection may get closed during version negotiation or handshake. | ||
ENVOY_LOG(error, "Fail to connect to server with error: {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a test failure rather than just a log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quic_http_integration_test has tests for handshake failure and expect it to continue at this point. In other tests, sending request with a disconnected client will naturally fail.
And in the future, we probably want to add version negotiation logic to client, either prod or test codec, to retry upon INVALID_VERSION, so this shouldn't be treated as test failure. I moved the TODO and comment from quic_http_integration_test to this place.
test/integration/http_integration.cc
Outdated
if (downstream_protocol_ != Http::CodecClient::Type::HTTP3) { | ||
return BaseIntegrationTest::initialize(); | ||
} | ||
NiceMock<Server::Configuration::MockTransportSocketFactoryContext> mock_factory_ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's factor this into a helper, and do you mind adding a comment why parts have to be done before and after initialize()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments.
I added mock_factory_ctx into createQuicUpstreamTransportSocketFactory() which is already a helper function.
@@ -59,6 +59,11 @@ void setDoNotValidateRouteConfig( | |||
return; \ | |||
} | |||
|
|||
#define EXCLUDE_DOWNSTREAM_HTTP3 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mind adding a TODO to address all of these?
test/integration/ssl_utility.cc
Outdated
common_context->mutable_validation_context() | ||
->add_hidden_envoy_deprecated_verify_subject_alt_name("spiffe://lyft.com/backend-team"); | ||
if (!options.san_.empty()) { | ||
// common_context->mutable_validation_context() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah if it passes let's just remove it.
test/integration/utility.cc
Outdated
const std::string& method, const std::string& url, | ||
const std::string& body, Http::CodecClient::Type type, | ||
const std::string& host, const std::string& content_type) { | ||
struct ConnectionCallbacks : public Network::ConnectionCallbacks { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class?
and personal preference for TestConnectionCallbacks or anything else that doesn't have the same name in a different namespace :-)
Let's also add a class comment on what the purpose of the class is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// QUIC connection may fail of INVALID_VERSION if both this client doesn't support any of | ||
// the versions the server advertised before handshake established. In this case the | ||
// connection is closed locally and this is in a blocking event loop. | ||
dispatcher_.exit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear on why we need this class.
Does TLS not test connection failures?
Can you snag me on IM and point me at where we're hanging and perhaps I can suggest an alternate fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ssl connection starts handshake in transport socket after TCP connection's connect() is called, but EnvoyQuicClientSession starts handshake while the CodecClient calls connect() in constructor. So I guess Ssl connection pauses in transport socket, but QUIC connection pauses in connect().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we get a TODO to clean this up? I think I'd prefer we rework the test framework to handle this failure mode than have this test class. I suspect there's a cleaner way but I don't want to block the PR on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO added.
Signed-off-by: Dan Zhang <danzh@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo the two requested nits.
@mattklein123 up for a quick sweep as well?
// QUIC connection may fail of INVALID_VERSION if both this client doesn't support any of | ||
// the versions the server advertised before handshake established. In this case the | ||
// connection is closed locally and this is in a blocking event loop. | ||
dispatcher_.exit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we get a TODO to clean this up? I think I'd prefer we rework the test framework to handle this failure mode than have this test class. I suspect there's a cleaner way but I don't want to block the PR on it.
Signed-off-by: Dan Zhang <danzh@google.com>
/retest |
Retrying Azure Pipelines: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting to see so much active progress on H3! Just some small comments on source.
/wait
@@ -42,6 +42,9 @@ ActiveQuicListener::ActiveQuicListener( | |||
&listener_config), | |||
dispatcher_(dispatcher), version_manager_(quic::CurrentSupportedVersions()), | |||
kernel_worker_routing_(kernel_worker_routing) { | |||
// This flag fix a QUICHE issue which may crash Envoy during connection close. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a TODO and/or link to a follow up issue? Do we plan on removing this later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quiche flags are like our runtime guards only they start false - it'll move to true by default and get removed over some number of import cycles.
@@ -29,6 +29,8 @@ QuicClientConnectionFactoryImpl::createQuicNetworkConnection( | |||
Http::PersistentQuicInfo& info, Event::Dispatcher& dispatcher, | |||
Network::Address::InstanceConstSharedPtr server_addr, | |||
Network::Address::InstanceConstSharedPtr local_addr) { | |||
// This flag fix a QUICHE issue which may crash Envoy during connection close. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO/link?
@@ -61,7 +61,10 @@ std::unique_ptr<T> spdyHeaderBlockToEnvoyHeaders(const spdy::SpdyHeaderBlock& he | |||
for (auto entry : header_block) { | |||
// TODO(danzh): Avoid temporary strings and addCopy() with string_view. | |||
std::string key(entry.first); | |||
headers->addCopy(Http::LowerCaseString(key), entry.second); | |||
std::vector<absl::string_view> values = absl::StrSplit(entry.second, '\0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for more comments
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Commit Message: Enable protocol integration test for HTTP3
Additional Description: some tests are disabled; Added quic client connection factory to HttpIntegrationTest to create quic client connection; Move the config setup logic from quic_http_integration_test to ConfigHelper and used it in the base class HttpIntegrationTest.
Add fixes in server and client streams to make some tests pass:
Check whether
:authority
is header has valid host name.Set
:protocol
and:path
field in request headers if the:method
is CONNECT.Quic trailers concatenates header values with the same key with '\0', split them into multiple headers.
Risk Level: low
Part of #12930 #2557