-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
conn_pool: making TCP upstreams pluggable #13548
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactor, thanks for this work to reduce differences between various upstream connection factories.
Please look at CI since there's some doc generation and spellcheck errors.
hdrs = [ | ||
"config.h", | ||
], | ||
security_posture = "robust_to_untrusted_downstream", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this extension interact with upstreams? Worth a comment explaining the interaction (if there's any) or saying that there should be no upstream interaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Enovy in general is considered not robust to untrusted upstreams.
@htuch is this documented somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the work @yanavlasov is doing in #12278, this is about to change shortly. Do you think you think there are untrusted upstream issues in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
htuch: I think part of the issue is that there's currently no way to indicate that an extension is robust to untrusted downstreams and agnostic to upstream input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's agnostic then it's robust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy to tag this whatever you folks decide - it's no different from the current TCP proxy code robustness which as I understand today is considered robust to downstream and not to upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanavlasov do you want to make a call on this one based on upstream hardening progress?
@@ -1029,6 +1030,29 @@ TEST_F(TcpProxyTest, DEPRECATED_FEATURE_TEST(HalfCloseProxy)) { | |||
upstream_callbacks_->onEvent(Network::ConnectionEvent::RemoteClose); | |||
} | |||
|
|||
// Test with an explicitly configured upstream. | |||
TEST_F(TcpProxyTest, DEPRECATED_FEATURE_TEST(ExplicitFactory)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused. GenericConnectionPoolProto is the proto that you added in this PR, but it seems that this test is for a feature that is being deprecated.
include/envoy/tcp/upstream.h
Outdated
* @param cm the cluster manager to get the connection pool from | ||
* @param hostname the hostname to use if doing connect tunneling, absl::nullopt otherwise. | ||
* @param context the load balancing context for this connection. | ||
* @param upstream_callbacks the callbacks to provide to the connection if succesfully created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
source/common/tcp_proxy/tcp_proxy.cc
Outdated
factory = &Envoy::Config::Utility::getAndCheckFactory<GenericConnPoolFactory>( | ||
cluster.info()->upstreamConfig().value()); | ||
} else { | ||
auto* cluster = cluster_manager_.get(cluster_name); | ||
if (!cluster) { | ||
return false; | ||
} | ||
// TODO(snowp): Ideally we should prevent this from being configured, but that's tricky to get | ||
// right since whether a cluster is invalid depends on both the tcp_proxy config + cluster | ||
// config. | ||
if ((cluster->info()->features() & Upstream::ClusterInfo::Features::HTTP2) == 0) { | ||
ENVOY_LOG(error, "Attempted to tunnel over HTTP/1.1, this is not supported. Set " | ||
"http2_protocol_options on the cluster."); | ||
return false; | ||
} | ||
|
||
generic_conn_pool_ = std::make_unique<HttpConnPool>(cluster_name, cluster_manager_, this, | ||
config_->tunnelingConfig()->hostname(), | ||
*upstream_callbacks_); | ||
if (generic_conn_pool_->valid()) { | ||
generic_conn_pool_->newStream(this); | ||
return true; | ||
} else { | ||
generic_conn_pool_.reset(); | ||
} | ||
factory = &Envoy::Config::Utility::getAndCheckFactoryByName<GenericConnPoolFactory>( | ||
"envoy.filters.connection_pools.tcp.generic"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getAndCheckFactoryByName may throw exception on invalid type_config
Is such "non-exist" factory config validated by XDS already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. This does look like data plane code with no obvious try/catch blocks in sight. The most concerning one is the getAndCheckFactory on line 447 above which is a function of config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's safe to lookup by name providing you can guarantee that the filter will be there; it will throw otherwise. The config thing becomes an issue only when you are actually loading config.
I wonder if we can make this more robust going forward by having the factory functions ASSERT that they are in a control plane context, using some main thread singleton. There's probably static annotations etc. we could use if we wanted to get fancy, but I can see how to build this dynamic check.
Also, Quic has this same pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling the factory lookup functions in a non control plane setting is not legal. I urge we invest in adding the appropriate ASSERTs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored to make it non-fatal and added a test. PTAL!
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some remaining nits. Looks good otherwise.
@@ -230,6 +230,33 @@ class Utility { | |||
return *factory; | |||
} | |||
|
|||
/** | |||
* Get a Factory from the registry with a particular name or return nullptr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: worth adding a @return statement to the comment?
source/common/tcp_proxy/tcp_proxy.cc
Outdated
@@ -442,14 +442,19 @@ Network::FilterStatus Filter::initializeUpstreamConnection() { | |||
} | |||
|
|||
bool Filter::maybeTunnel(Upstream::ThreadLocalCluster& cluster, const std::string& cluster_name) { | |||
std::cerr << " HERE\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Remove debug logging.
* @param info supplies the stream info object associated with the upstream connection. | ||
* @param upstream supplies the generic upstream for the stream. | ||
* @param host supplies the description of the host that will carry the request. | ||
* @param upstream_local_address supplies the local address of the upstream connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: documentation and argument names disagree: upstream_local_address vs local_address
@snowp or @mattklein123 up for second pass? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this looks good for the most part, just a few comments
include/envoy/tcp/upstream.h
Outdated
* | ||
* @param callbacks callbacks to communicate stream failure or creation on. | ||
*/ | ||
virtual void newStream(GenericConnectionPoolCallbacks* callbacks) PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this take a ref instead of a pointer?
include/envoy/tcp/upstream.h
Outdated
*/ | ||
virtual GenericConnPoolPtr | ||
createGenericConnPool(const std::string& cluster_name, Upstream::ClusterManager& cm, | ||
absl::optional<std::string> hostname, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const ref
source/common/tcp_proxy/tcp_proxy.cc
Outdated
} | ||
|
||
absl::optional<std::string> hostname = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does a string copy, which means two string copies here since hostname is passed by value to createGenericConnPool
. Might not matter in this context
A higher level comment: should we do the factory lookup + partially populate the arguments during filter factory creation? That way we avoid factory lookup and string copies etc in the data path
@snowp I think I finally got CI sorted, PTAL! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me! Just a few small comments
include/envoy/tcp/upstream.h
Outdated
* | ||
* @param disable true if the stream should be read disabled, false otherwise. | ||
* @return returns true if the disable is performed, false otherwise | ||
(e.g. if the connection is closed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing *
include/envoy/tcp/upstream.h
Outdated
* @param config the tunneling config, if doing connect tunneling. | ||
* @param context the load balancing context for this connection. | ||
* @param upstream_callbacks the callbacks to provide to the connection if successfully created. | ||
* @return may be null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when would this be null?
source/common/config/utility.h
Outdated
@@ -230,6 +231,33 @@ class Utility { | |||
return *factory; | |||
} | |||
|
|||
/** | |||
* Get a Factory from the registry with a particular name or return nullptr. | |||
* @param name string identifier for the particular implementation. Note: this is a proto string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this is just a regular string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@htuch @mattklein123 @lizan I can has API stamp? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API LGTM
Risk Level: Medium (some refactory)
Testing: existing tests pass
Docs Changes: n/a
Release Notes: n/a
Fixes #13185