Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quiche: enable downstream HTTP3 in quic_protocol_integration_test #15424

Merged
merged 27 commits into from
Mar 24, 2021

Conversation

danzh2010
Copy link
Contributor

@danzh2010 danzh2010 commented Mar 11, 2021

Commit Message: Enable protocol integration test for HTTP3
Additional Description: some tests are disabled; Added quic client connection factory to HttpIntegrationTest to create quic client connection; Move the config setup logic from quic_http_integration_test to ConfigHelper and used it in the base class HttpIntegrationTest.

Add fixes in server and client streams to make some tests pass:
Check whether :authority is header has valid host name.
Set :protocol and :path field in request headers if the :method is CONNECT.
Quic trailers concatenates header values with the same key with '\0', split them into multiple headers.

Risk Level: low

Part of #12930 #2557

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! So good to get more coverage for HTTP/3!

@@ -188,6 +193,10 @@ name: health_check

// Verifies behavior for https://github.com/envoyproxy/envoy/pull/11248
TEST_P(ProtocolIntegrationTest, AddBodyToRequestAndWaitForIt) {
// QUICHE can't guarantee headers and FIN to be delivered together, so
// headers-only request can't be detected at L7 filters.
EXCLUDE_DOWNSTREAM_HTTP3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine to leave this disabled for now but I think we can adapt the test filter to change a request with no body (headers, fin) to do (headers, fin, body). I've adapted a few filters in my test PR so might as well wait for that to land.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -425,13 +442,13 @@ TEST_P(DownstreamProtocolIntegrationTest, MissingHeadersLocalReply) {

// Regression test for https://github.com/envoyproxy/envoy/issues/10270
TEST_P(ProtocolIntegrationTest, LongHeaderValueWithSpaces) {
EXCLUDE_DOWNSTREAM_HTTP3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of the header tests should be fixed by 15381 but others need configurable header size to pass. We'll have to figure that one out at some point :-)

case Http::CodecClient::Type::HTTP2:
stat_name = "http2.dropped_headers_with_underscores";
break;
case Http::CodecClient::Type::HTTP3:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO to add a stat for H3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

test/integration/protocol_integration_test.cc Show resolved Hide resolved
test/integration/http_integration.cc Outdated Show resolved Hide resolved
test/integration/http_integration.cc Show resolved Hide resolved
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

Signed-off-by: Dan Zhang <danzh@google.com>
@@ -61,7 +61,10 @@ std::unique_ptr<T> spdyHeaderBlockToEnvoyHeaders(const spdy::SpdyHeaderBlock& he
for (auto entry : header_block) {
// TODO(danzh): Avoid temporary strings and addCopy() with string_view.
std::string key(entry.first);
headers->addCopy(Http::LowerCaseString(key), entry.second);
std::vector<absl::string_view> values = absl::StrSplit(entry.second, '\0');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised to see splitting on \0 and so I went looking around and found the definition in spdy/core/spdy_header_block.h which confirms it, but not all the time:

  // If a header with the key is already present, then append the value to the
  // existing header value, NUL ("\0") separated unless the key is cookie, in
  // which case the separator is "; ".
  // If there is no such key, a new header with the key and value is added.
  void AppendValueOrAddHeader(const absl::string_view key,
                              const absl::string_view value);

I've never worked with that library before but I wonder if it supplies some sort of split API we should be using rather than calling StrSplit directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Josh, is your concern that the header may be split by \0 or ;? I think the base Enovy classes can handle ; (e.g. parseCookieValue) but would basically fail to recognize quic-multi-valued-headers with \0 because HTTP/2 and below don't support that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mostly just thinking that the \0 delim looks like an implementation detail of spdy_header_block.h and the detail should ideally be abstracted away at that level.

And yes I was also wondering specifically about what the behavior would be with cookies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, SpdyHeaderBlock doesn't have such API to split the \0 delimited header values. Even quic qpack code has to write its own logic to handle this special case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments here might help at least clear up the mysticism, if it's really impossible to create an abstraction boundary around this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for more comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

for (auto& listener : *bootstrap_.mutable_static_resources()->mutable_listeners()) {
if (listener.udp_listener_config().listener_config().typed_config().type_url() ==
"type.googleapis.com/envoy.config.listener.v3.QuicProtocolOptions") {
auto* filter_chain = listener.mutable_filter_chains(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be checking that filter_chains is non-empty?

Copy link
Contributor Author

@danzh2010 danzh2010 Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a test util function whose input is the config template defined in the same file. But I added ASSERT to help future debugging.

ConnectionCallbacks(Event::Dispatcher& dispatcher) : dispatcher_(dispatcher) {}

// Network::ConnectionCallbacks
void onEvent(Network::ConnectionEvent event) override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE questions asked over DM about integration_admin_test, skimming over your PR it looks like this code might be interesting for some breakpoints.

I don't have a clear picture of why it's needed, so maybe some class comments would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

NiceMock<Stats::MockIsolatedStatsStore> mock_stats_store;
NiceMock<Random::MockRandomGenerator> random;
Event::GlobalTimeSystem time_system;
NiceMock<Random::MockRandomGenerator> random_generator;
Api::Impl api(Thread::threadFactoryForTest(), mock_stats_store, time_system,
Filesystem::fileSystemForTest(), random_generator);
Event::DispatcherPtr dispatcher(api.allocateDispatcher("test_thread"));
ConnectionCallbacks connection_callbacks(*dispatcher);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function has gotten a lot longer and harder to follow due to the handling of h3 in it. Can we think of ways to refactor this so that the function structure remains more like it used to be?

E.g. one option would be to just add an early dispatch:

  if (h3 mode) {
    return makeSingleH3Request(...);
  }

and then factor out the common blocks from the h3 and non-h3 impls into some helpers.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for suggesting the refactoring! Surprisingly, this fix the TSAN issue I asked earlier. I found that createQuicUpstreamTransportSocketFactory() which was also called under h2 and h1 earlier somehow breaks the stats histogram Expectation. But I still don't understand how could a client util function lead to server stats change. This function calls a registered factory to create a quic upstream transport socket factory for the test client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the reason is that went looking at the code in its previous state, it was very heard to discern whether the behavior was the same in H2 with/without your change. I suspect it wasn't, and that was what was affecting the integration test tsan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still interested in why calling into a registered factory on client side could change server stats. This helper function is pretty much self contained. Do you have any idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd guess this is inducing the client to change the way it communicates with the server in a stat-observable way, but I'd have to dig in with the debugger to learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reproduced the same failure by adding a line to print trace in a clean branch https://github.com/envoyproxy/envoy/compare/main...danzh2010:admin_test?expand=1. Is there some existing race in that test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Would a sleep there also cause the test to fail? How about printing a bunch of text to stderr?

That trace macro doesn't look like it ought to wake up anything in stats to change behavior, so I suspect a sleep or simple I/O might do the same thing.

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@@ -61,7 +61,10 @@ std::unique_ptr<T> spdyHeaderBlockToEnvoyHeaders(const spdy::SpdyHeaderBlock& he
for (auto entry : header_block) {
// TODO(danzh): Avoid temporary strings and addCopy() with string_view.
std::string key(entry.first);
headers->addCopy(Http::LowerCaseString(key), entry.second);
std::vector<absl::string_view> values = absl::StrSplit(entry.second, '\0');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Josh, is your concern that the header may be split by \0 or ;? I think the base Enovy classes can handle ; (e.g. parseCookieValue) but would basically fail to recognize quic-multi-valued-headers with \0 because HTTP/2 and below don't support that.

auto envoy_headers = spdyHeaderBlockToEnvoyHeaders<Http::RequestHeaderMapImpl>(headers_block);
EXPECT_EQ(headers_block.size(), envoy_headers->size());
EXPECT_EQ(headers_block.size() + 1u, envoy_headers->size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment why the +1 please, since it's not obvious about the \0 split

host_description, downstream_protocol_);
if (downstream_protocol_ == Http::CodecClient::Type::HTTP3 && codec->disconnected()) {
// Connection may get closed during version negotiation or handshake.
ENVOY_LOG(error, "Fail to connect to server with error: {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a test failure rather than just a log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quic_http_integration_test has tests for handshake failure and expect it to continue at this point. In other tests, sending request with a disconnected client will naturally fail.

And in the future, we probably want to add version negotiation logic to client, either prod or test codec, to retry upon INVALID_VERSION, so this shouldn't be treated as test failure. I moved the TODO and comment from quic_http_integration_test to this place.

if (downstream_protocol_ != Http::CodecClient::Type::HTTP3) {
return BaseIntegrationTest::initialize();
}
NiceMock<Server::Configuration::MockTransportSocketFactoryContext> mock_factory_ctx;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's factor this into a helper, and do you mind adding a comment why parts have to be done before and after initialize()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments.

I added mock_factory_ctx into createQuicUpstreamTransportSocketFactory() which is already a helper function.

@@ -59,6 +59,11 @@ void setDoNotValidateRouteConfig(
return; \
}

#define EXCLUDE_DOWNSTREAM_HTTP3 \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind adding a TODO to address all of these?

common_context->mutable_validation_context()
->add_hidden_envoy_deprecated_verify_subject_alt_name("spiffe://lyft.com/backend-team");
if (!options.san_.empty()) {
// common_context->mutable_validation_context()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah if it passes let's just remove it.

const std::string& method, const std::string& url,
const std::string& body, Http::CodecClient::Type type,
const std::string& host, const std::string& content_type) {
struct ConnectionCallbacks : public Network::ConnectionCallbacks {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class?
and personal preference for TestConnectionCallbacks or anything else that doesn't have the same name in a different namespace :-)

Let's also add a class comment on what the purpose of the class is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// QUIC connection may fail of INVALID_VERSION if both this client doesn't support any of
// the versions the server advertised before handshake established. In this case the
// connection is closed locally and this is in a blocking event loop.
dispatcher_.exit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear on why we need this class.
Does TLS not test connection failures?
Can you snag me on IM and point me at where we're hanging and perhaps I can suggest an alternate fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ssl connection starts handshake in transport socket after TCP connection's connect() is called, but EnvoyQuicClientSession starts handshake while the CodecClient calls connect() in constructor. So I guess Ssl connection pauses in transport socket, but QUIC connection pauses in connect().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get a TODO to clean this up? I think I'd prefer we rework the test framework to handle this failure mode than have this test class. I suspect there's a cleaner way but I don't want to block the PR on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO added.

Signed-off-by: Dan Zhang <danzh@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the two requested nits.
@mattklein123 up for a quick sweep as well?

// QUIC connection may fail of INVALID_VERSION if both this client doesn't support any of
// the versions the server advertised before handshake established. In this case the
// connection is closed locally and this is in a blocking event loop.
dispatcher_.exit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get a TODO to clean this up? I think I'd prefer we rework the test framework to handle this failure mode than have this test class. I suspect there's a cleaner way but I don't want to block the PR on it.

Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #15424 (comment) was created by @danzh2010.

see: more, trace.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exciting to see so much active progress on H3! Just some small comments on source.

/wait

@@ -42,6 +42,9 @@ ActiveQuicListener::ActiveQuicListener(
&listener_config),
dispatcher_(dispatcher), version_manager_(quic::CurrentSupportedVersions()),
kernel_worker_routing_(kernel_worker_routing) {
// This flag fix a QUICHE issue which may crash Envoy during connection close.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a TODO and/or link to a follow up issue? Do we plan on removing this later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quiche flags are like our runtime guards only they start false - it'll move to true by default and get removed over some number of import cycles.

@@ -29,6 +29,8 @@ QuicClientConnectionFactoryImpl::createQuicNetworkConnection(
Http::PersistentQuicInfo& info, Event::Dispatcher& dispatcher,
Network::Address::InstanceConstSharedPtr server_addr,
Network::Address::InstanceConstSharedPtr local_addr) {
// This flag fix a QUICHE issue which may crash Envoy during connection close.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO/link?

@@ -61,7 +61,10 @@ std::unique_ptr<T> spdyHeaderBlockToEnvoyHeaders(const spdy::SpdyHeaderBlock& he
for (auto entry : header_block) {
// TODO(danzh): Avoid temporary strings and addCopy() with string_view.
std::string key(entry.first);
headers->addCopy(Http::LowerCaseString(key), entry.second);
std::vector<absl::string_view> values = absl::StrSplit(entry.second, '\0');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for more comments

alyssawilk
alyssawilk previously approved these changes Mar 23, 2021
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@alyssawilk alyssawilk merged commit e5ab06f into envoyproxy:main Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants