-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use local_info.node() instead of bootstrap.node() whenever possible #4120
Conversation
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Test please? |
@mattklein123 yes, sorry missing some commits. I'm on it. |
WDYT about adding an |
@mattklein123 I feel no confidence on the change for initializing And I haven't found a way to check the effect of those CLI flags directly. But, since it is being forced for xDS, via |
@ramaraochavali yeah, I'm thinking the same, what do you think about 6c263d3? Do you have any suggestions to add similar checks in other places? |
@@ -211,7 +211,7 @@ ClusterManagerImpl::ClusterManagerImpl(const envoy::config::bootstrap::v2::Boots | |||
// Now setup ADS if needed, this might rely on a primary cluster. | |||
if (bootstrap.dynamic_resources().has_ads_config()) { | |||
ads_mux_.reset(new Config::GrpcMuxImpl( | |||
bootstrap.node(), | |||
local_info.node(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add method similar to Config::Utility::checkLocalInfo("ads", local_info)
https://github.com/envoyproxy/envoy/blob/master/source/common/config/grpc_mux_impl.cc#L18
and validate that node has right info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can. However currently, if we failed to supply the info, it firstly throws an error for "cds", which I think serves the same purpose.
[2018-08-13 17:32:35.821][2350767][critical][main] source/server/server.cc:78] error initializing configuration '/envoy/config-local.yaml': cds: setting --service-cluster and --service-node is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramaraochavali after an experiment, I think it is tempting to have Config::Utility::checkLocalInfo
inside the GrpcMuxImpl
's constructor (and testing it is easier, yeay!). However, since we want to pass in the local_info
here, the implication is quite large (but mostly "replacing" node
with local_info
). I have the changeset ready but want to gather opinions about it.
cc. @htuch @mattklein123.
@@ -132,6 +132,10 @@ class AdsIntegrationTest : public AdsIntegrationBaseTest, | |||
envoy::api::v2::DiscoveryRequest discovery_request; | |||
VERIFY_ASSERTION(ads_stream_->waitForGrpcMessage(*dispatcher_, discovery_request)); | |||
|
|||
ASSERT(discovery_request.has_node()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think while these ASSERTs
are good, they would just validate existing tests right? Can we add a test in grpc_mux_impl_test
that validates bad Node
passed to it along the lines of this test
envoy/test/server/lds_api_test.cc
Line 203 in 0228180
TEST_F(LdsApiTest, BadLocalInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. Will try to read and add a test.
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@@ -164,6 +164,9 @@ class LoadStatsIntegrationTest : public HttpIntegrationTest, | |||
return; | |||
} else if (loadstats_request.cluster_stats_size() == 0) { | |||
loadstats_request.CopyFrom(local_loadstats_request); | |||
ASSERT(loadstats_request.has_node()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want ASSERT_TRUE
to use the asserts from gtest.
@@ -301,7 +301,7 @@ ClusterManagerImpl::ClusterManagerImpl(const envoy::config::bootstrap::v2::Boots | |||
if (cm_config.has_load_stats_config()) { | |||
const auto& load_stats_config = cm_config.load_stats_config(); | |||
load_stats_reporter_.reset( | |||
new LoadStatsReporter(bootstrap.node(), *this, stats, | |||
new LoadStatsReporter(local_info.node(), *this, stats, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option, not saying you have to do this, is to treat the CLI flags as a bootstrap override and override the bootstrap proto early in server config, so that local_info
and bootstrap
are consistent. Or, maybe nuke node from bootstrap once you've loaded up local_info. Ideally we have a single source-of-truth, and it's hard to confuse with other potential sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, it would be great to have a single source of truth here. From an interface perspective it seems better to have LocalInfo be the source of truth but I don't feel that strongly about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use local_info
.
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio Thanks for this. I like the validation in the constructor. Tests looks much cleaner. LGTM |
*Protobuf::DescriptorPool::generated_pool()->FindMethodByName( | ||
"envoy.service.discovery.v2.AggregatedDiscoveryService.StreamAggregatedResources"), | ||
random_, time_source_), | ||
EnvoyException, "ads: setting --service-cluster and --service-node is required"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated: these error messages are misleading nowadays, since you can specify this same information in the bootstrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some difficulties with crafting related test before, so I guess I need to read more on bootstrap initialization steps (server init).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we should revisit the thrown error message from Config::Utility::checkLocalInfo(_, _);
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the message, not sure about its clarity and correctness.
EXPECT_CALL(dispatcher_, createTimer_(_)).WillOnce(Invoke([this](Event::TimerCb timer_cb) { | ||
timer_cb_ = timer_cb; | ||
timer_ = new Event::MockTimer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the newly introduced tests do not call setup()
, and timer_
was previously initialized as new Event::MockTimer()
at constructor, this makes sure no leak. Caught by LeakSanitizer
in this build: https://circleci.com/gh/envoyproxy/envoy/84216.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah if we go with the other direction it will be reverted.
@@ -211,7 +211,7 @@ ClusterManagerImpl::ClusterManagerImpl(const envoy::config::bootstrap::v2::Boots | |||
// Now setup ADS if needed, this might rely on a primary cluster. | |||
if (bootstrap.dynamic_resources().has_ads_config()) { | |||
ads_mux_.reset(new Config::GrpcMuxImpl( | |||
bootstrap.node(), | |||
local_info, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed opinions on this change. On the one hand, we make it more full proof, by forcing the LocalInfo
type. On the other, we are now passing a strict superset of what we were passing before, and the new information is not used. My main thoughts on making LocalInfo
canonical would be to just delete the node info from bootstrap very early in server init. But, this approach also works..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting POV, will try to explore that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessarily the correct PoV; just sharing where my thinking was originally :)
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's ship this as it's a definite improvement, we can bikeshed about internals later on.
* origin/master: (34 commits) fuzz: fixes oss-fuzz: 9895 (envoyproxy#4189) docs: added developer docs for pprof/tcmalloc testing (envoyproxy#4194) fix sometimes returns response body to HEAD requests (envoyproxy#3985) admin: fix config dump order (envoyproxy#4197) thrift: allow translation between downstream and upstream protocols (envoyproxy#4136) Use local_info.node() instead of bootstrap.node() whenever possible (envoyproxy#4120) upstream: allow extension protocol options to be used with http filters (envoyproxy#4165) [thrift_proxy] Replace local HeaderMap with Http::HeaderMap[Impl] (envoyproxy#4169) docs: improve gRPC-JSON filter doc (envoyproxy#4184) stats: refactoring MetricImpl without strings (envoyproxy#4190) fuzz: coverage profile generation instructions. (envoyproxy#4193) upstream: rebuild cluster when health check config is changed (envoyproxy#4075) build: use clang-6.0. (envoyproxy#4168) thrift_proxy: introduce header transport (envoyproxy#4082) tcp: allow connection pool callers to store protocol state (envoyproxy#4131) configs: match latest API changes (envoyproxy#4185) Fix wrong mock function name. (envoyproxy#4187) Bump yaml-cpp so it builds with Visual Studio 15.8 (envoyproxy#4182) Converting envoy configs to V2 (envoyproxy#2957) Add timestamp to HealthCheckEvent definition (envoyproxy#4119) ... Signed-off-by: Snow Pettersen <snowp@squareup.com>
Description:
Use
local_info.node()
instead ofbootstrap.node()
whenever possible.Risk Level: Low
Testing: Existing
Docs Changes: N/A
Release Notes: N/A
Fixes #4085
Signed-off-by: Dhi Aurrahman dio@tetrate.io