socket: IP_FREEBIND support for listeners and upstream connections. #2922
@jrajahalme @rlenglet this is the |
@@ -0,0 +1,39 @@
admin:
Quick drive by: Can we test this config (just that it loads) in our config tests? Along with the original_dst config? From a quick look it doesn't seem like they are being tested for config sanity.
I'll take this as a follow up action item, I have a WiP to fix this, but it's a bit complicated because we're using a MockListenerComponentFactory, which is not compatible with socket options. I'd like to unblock #2719.
OK SGTM.
+1 - let's avoid text config because when it stops working we won't be alerted but a TODO is fine.
I think we do need to be able to test socket options, esp with all the other PRs in flight
Corresponding PR envoyproxy/envoy#2922. Signed-off-by: Harvey Tuch <htuch@google.com>
The SocketOptionImpl design LGTM.
class SocketOptionImpl : public Socket::Option, Logger::Loggable<Logger::Id::connection> {
public:
  SocketOptionImpl(bool freebind) : freebind_(freebind) {}
Just to confirm, when adding support for another option, we'll only need to modify this library here and 2 subclasses (UpstreamSocketOption and ListenerSocketOption). Is that correct?
Correct. You only need to modify here if it's a shared option, and specifically in the subclasses if it's listener or upstream only. I haven't fully plumbed the hash stuff, I'll leave that for you folks.
namespace Network {

bool SocketOptionImpl::setOption(Socket& socket, Socket::SocketState state) const {
  if (state == Socket::SocketState::PreBind) {
Please also move the SOL_SOCKET, SO_REUSEADDR option from TcpListenSocket::TcpListenSocket, so that all socket options are set in this one library.
This will require either defining a new setTcpSocketOption method or passing setsockopt's level as a parameter to use SOL_SOCKET.
I will leave a TODO, I think this is orthogonal and that setsockopt doesn't cause any #ifdef tangle. So, in the interest of unblocking #2719 will skip for now.
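For readers following the suggestion about passing setsockopt's level as a parameter, here is a minimal sketch of what such a helper could look like. The names setIntSocketOption and getIntSocketOption are hypothetical and are not part of this PR or of Envoy; only the POSIX setsockopt/getsockopt calls are real.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not Envoy code): sets an int-valued option at an explicit
// level, so SOL_SOCKET options and IPPROTO_IP options can share one code path.
// Returns 0 on success, otherwise the errno reported by setsockopt().
int setIntSocketOption(int fd, int level, int optname, int value) {
  if (::setsockopt(fd, level, optname, &value, sizeof(value)) != 0) {
    return errno;
  }
  return 0;
}

// Hypothetical helper: reads an int-valued option back, or -1 if getsockopt() fails.
int getIntSocketOption(int fd, int level, int optname) {
  int value = 0;
  socklen_t len = sizeof(value);
  if (::getsockopt(fd, level, optname, &value, &len) != 0) {
    return -1;
  }
  return value;
}
```

With a helper of this shape, the SO_REUSEADDR call would become setIntSocketOption(fd, SOL_SOCKET, SO_REUSEADDR, 1), issued before bind().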
@htuch I think you should also update the JSON parser to add support for the freebind option, cf. for example what I've done for the transparent option in #2719:
https://github.com/envoyproxy/envoy/pull/2719/files#diff-ee0a88c03932b3f263f41d176e762f1aR60
https://github.com/envoyproxy/envoy/pull/2719/files#diff-dcd9a089191ce4c6b1175d44e65ac9b0R188
@rlenglet We're trying very hard to not add any more v1 config changes, as it is deprecated.
I know some of this work is to make it easier for the other socket options in the queue to be added. Can you give a brief overview of what those additions will look like? Specifically, I'm not understanding whether SocketOptionImpl is going to be extended with other options, or whether those should go in separate classes.
@@ -55,13 +55,14 @@ class Socket {
 */
virtual void hashKey(std::vector<uint8_t>& key) const PURE;
};
typedef std::unique_ptr<Option> OptionPtr;
typedef std::shared_ptr<std::vector<OptionPtr>> OptionsSharedPtr;
typedef std::shared_ptr<const Option> OptionConstSharedPtr;
Can you comment on why this needs to be a shared_ptr (not unique)? I skimmed the code but didn't see where/why they need to be shared.
+1, why does this need to be shared?
This is now an immutable object which benefits from shared const ownership; we do this already for address instances and some other cases. unique_ptr was problematic given how the OptionPtr vector is combined in https://github.com/envoyproxy/envoy/pull/2922/files#diff-cd280606785949980237b2a7f08b7885R95, you would need some way to copy/clone the objects across the lists otherwise.
I see where you need the shared_ptr in the vector; what I didn't like was the shared_ptr to a vector of shared_ptrs (OptionsSharedPtr). If it is inevitable, I'm fine.
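To illustrate the shared const ownership argument from this thread, here is a small sketch of combining two option lists without cloning. The Option and FixedOption types are simplified stand-ins invented for this example (the real Socket::Option interface has setOption() and hashKey()), and combineOptions is a hypothetical name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Simplified stand-in for Socket::Option, for illustration only.
struct Option {
  virtual ~Option() = default;
  virtual int id() const = 0;
};

struct FixedOption : Option {
  explicit FixedOption(int id) : id_(id) {}
  int id() const override { return id_; }
  const int id_;
};

using OptionConstSharedPtr = std::shared_ptr<const Option>;
using Options = std::vector<OptionConstSharedPtr>;

// Combine two option lists. Copying a shared_ptr<const Option> only bumps a
// reference count, so neither source list is modified and nothing is cloned.
// With unique_ptr elements this would need a clone() method or a move that
// empties one of the source lists.
Options combineOptions(const Options& base, const Options& extra) {
  Options combined(base);
  combined.insert(combined.end(), extra.begin(), extra.end());
  return combined;
}
```

Because the pointee is const, sharing one option object between the cluster-level list and a per-connection list cannot lead to one owner mutating state the other depends on.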
@@ -26,9 +26,9 @@ class SocketImpl : public virtual Socket {
      fd_ = -1;
    }
  }
  void addOption(OptionPtr&& option) override {
  void addOption(OptionConstSharedPtr&& option) override {
As a shared_ptr, this probably shouldn't be an rvalue.
Good catch, this was a global search/replace snafu.
You probably already saw but this issue is in a bunch of other places.
    SocketOptionName ipv6_optname, const void* optval, socklen_t optlen);

private:
  const bool freebind_;
Should this be an optional, to match the type in data-plane-api?
Yeah, I went back-and-forth on whether we should push the optional logic to the client or put it here. I think given my understanding of how SocketOptions are intended to be combined, we probably should make it optional here, so I'll make this change.
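A minimal sketch of the tri-state idea discussed here: an unset value means "do not touch the socket", while an explicit true or false is translated to the int that setsockopt() expects. It uses std::optional from C++17 as an assumption; the optional type actually used in the PR may differ, and freebindOptionValue is a hypothetical name.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: translate a tri-state freebind setting into the int
// passed to setsockopt(). std::nullopt means "leave the socket default alone".
std::optional<int> freebindOptionValue(const std::optional<bool>& freebind) {
  if (!freebind.has_value()) {
    return std::nullopt;
  }
  // Explicit bool-to-int conversion rather than relying on implicit promotion.
  return freebind.value() ? 1 : 0;
}
```

Keeping the unset state distinct from false matters once options from several sources are combined, since an unset value should not override an explicit one.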
If there's more work to be done here, could we try to get #2719 merged first?
@rlenglet I think in order to get through review for #2719 we'd need to fixup a bunch of stuff that is in this PR. I don't think there's more work to be done in this specific PR; I have two followups that I'm going to take to future PRs: (1) address @mattklein123 ask for freebind.yaml to be config sanity checked in tests and (2) think about how to test outbound socket freebind with packets flowing (I can manually validate the bind/bind fail there). So, we should be able to move quickly here.
configs/freebind/freebind.yaml
Outdated
- socket_address:
    address: 127.0.0.1
    port_value: 10001
# upstream_bind_config:
Do you mean for this to be commented out?
Yes, it relates to my TODO on figuring out how to test this working with packets flowing end-to-end. I was mucking around with iptables today, but it seems that basic NATing isn't sufficient. I'll add an explicit TODO there as well.
if (options) {
  *cluster_options = *options;
}
if (cluster.features() & ClusterInfo::Features::FREEBIND) {
This line is redundant, it is already met in line 94.
Thanks, that was some forest-and-trees code, will remove.
if (socket.localAddress()) {
  ip = socket.localAddress()->ip();
} else {
  address = Address::addressFromFd(socket.fd());
When do we not have a localAddress() here? Can you add some more comments? (Just wondering why we can't just check if this is an IP socket and just return if not -- why do we need to try to get the address from the fd)
This is during upstream connection in https://github.com/envoyproxy/envoy/blob/master/source/common/network/connection_impl.cc#L543, we don't set the local address until https://github.com/envoyproxy/envoy/blob/master/source/common/network/connection_impl.cc#L593. I will add a comment.
That's a bummer. I would probably add a TODO here to look into cleaning this up. Optimally, localAddress() would be available unconditionally before we call this code so we can avoid this exception stuff.
I don't know if it's possible to clean this up; we only know the full localAddress() after we have done the bind/connect AFAIK, since that determines the local port for outgoing connections. I'll add a TODO to provide the IP version earlier, which we can do.
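As a sketch of why the IP version can be recovered from the fd before the local address is known: on Linux, getsockname() reports the address family of a socket even before bind() or connect() has assigned a port. The function name addressFamilyFromFd is hypothetical, and this is not a description of how Address::addressFromFd is implemented.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns AF_INET, AF_INET6, etc. for the socket, or -1
// if getsockname() fails (for example, because fd is not a socket).
int addressFamilyFromFd(int fd) {
  sockaddr_storage ss{};
  socklen_t len = sizeof(ss);
  if (::getsockname(fd, reinterpret_cast<sockaddr*>(&ss), &len) != 0) {
    return -1;
  }
  return ss.ss_family;
}
```

This is enough to choose between the IPv4 and IPv6 variants of an option, even though the full local address (including the port) is only available after bind/connect.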
@ggreenway the context is #2719, that's where additional options start to appear (specifically on the Listener side). I actually based this on #2719 patch and then added |
}

// Socket::Option implementation for API-defined listener upstream options.
What is a "listener upstream option?" Clarify?
That's a copy-paste snafu, fixed.
/**
 * Add a socket option visitor for later retrieval with options().
 */
virtual void addOption(OptionPtr&&) PURE;
virtual void addOption(OptionConstSharedPtr) PURE;
nit: this should probably be const OptionConstSharedPtr& here and in similar places elsewhere.
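A short sketch of the parameter-passing nit: taking the shared_ptr by const reference defers the reference-count increment to the point where the pointer is actually stored. OptionHolder is a hypothetical class loosely modelled on the addOption() signature under review, and Option here is an empty stand-in type.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Empty stand-in for Socket::Option, for illustration only.
struct Option {
  virtual ~Option() = default;
};
using OptionConstSharedPtr = std::shared_ptr<const Option>;

// Hypothetical holder, not Envoy's SocketImpl.
class OptionHolder {
public:
  // Passing by const& avoids a copy at the call site; the single copy (and
  // reference-count increment) happens when the pointer is pushed into the vector.
  void addOption(const OptionConstSharedPtr& option) { options_.push_back(option); }
  const std::vector<OptionConstSharedPtr>& options() const { return options_; }

private:
  std::vector<OptionConstSharedPtr> options_;
};
```

An rvalue-reference parameter, by contrast, would force callers who want to keep their own pointer to copy it explicitly first, which is why it is a poor fit for a shared option.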
 * platform for fd after the above option level fallback semantics are taken into account or the
 * socket is non-IP.
 */
static int setIpSocketOption(Socket& socket, SocketOptionName ipv4_optname,
This is a non-actionable thought/comment: but I do feel like this series of changes is increasingly putting more UNIX/Linux stuff directly in the codebase and I wish we were thinking a bit more about x-platform. I would not worry about it now but just raising as food for thought.
I think this PR deals reasonably with the POSIXy side of the world. I agree that in general though, we would benefit from having something like source/platform that moves these specifics out of the Envoy core. It seems inevitable that any sufficiently advanced proxy will have features that are deeply tied to specific kernel features, so we can't be too abstract.
What are the other non-POSIXy platforms we care about besides Windows?
Realistically I think only Windows right now. (I agree this is a very nice abstraction around POSIX socket options that might not exist on the host system.)
+1 for platform work early while it's not too painful.
Harvey: consider pinging Randy when he joins the team next week - I wonder if we can steal best practices from the chrome network stack...
Cool stuff. Generally LGTM at a high level. Left some drive by comments.
bool SocketOptionImpl::setOption(Socket& socket, Socket::SocketState state) const {
  if (state == Socket::SocketState::PreBind) {
    if (freebind_.has_value()) {
      const int option = freebind_.value();
nit: converting from bool to int, I'd prefer "freebind_.value() ? 1 : 0". But feel free to disagree with my style preference.
also naming: should_freebind? It makes the SetSocketOption() more clear
Aside from one minor nit (that you can choose to ignore if you want), this LGTM
Thanks for driving this through!
  return os_syscalls.setsockopt(socket.fd(), IPPROTO_IP, ipv4_optname.value(), optval, optlen);
}

// If the FD is v6, we first try the IPv6 variant if we can and fallback to the IPv4 variant.
-> we use the IPv6 variant if configured, otherwise we use the IPv4 variant.
We don't really try both.
if (ipv4_optname) {
  return os_syscalls.setsockopt(socket.fd(), IPPROTO_IP, ipv4_optname.value(), optval, optlen);
}
return ENOTSUP;
I strongly encourage LOGs for each failure mode. "this failed" is less useful than "you're using an IPv4 socket and forgot and configured an IPv6 only option"
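One way to get a distinct message per failure mode is to separate the choice of option variant from the syscall, so each ENOTSUP path carries its own reason. The sketch below is an assumption about how that could be structured; IpVersion, Selection and selectIpOption are hypothetical names, and the selection rule (IPv4 sockets need the IPv4 name, IPv6 sockets use the IPv6 name if present and otherwise the IPv4 one) follows the code comments quoted in this review.

```cpp
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <optional>
#include <string>

enum class IpVersion { v4, v6 };

// Result of choosing which (level, optname) pair to hand to setsockopt().
struct Selection {
  int error;          // 0 on success, ENOTSUP if no usable option name exists.
  int level;          // IPPROTO_IP or IPPROTO_IPV6 when error == 0.
  int optname;        // The chosen option name when error == 0.
  std::string reason; // Human-readable cause, suitable for a log line.
};

Selection selectIpOption(IpVersion version, std::optional<int> ipv4_optname,
                         std::optional<int> ipv6_optname) {
  if (version == IpVersion::v4) {
    if (!ipv4_optname) {
      return {ENOTSUP, 0, 0, "IPv4 socket, but no IPv4 option name is available"};
    }
    return {0, IPPROTO_IP, *ipv4_optname, ""};
  }
  // IPv6 socket: use the IPv6 variant when present, else the IPv4 variant.
  if (ipv6_optname) {
    return {0, IPPROTO_IPV6, *ipv6_optname, ""};
  }
  if (ipv4_optname) {
    return {0, IPPROTO_IP, *ipv4_optname, ""};
  }
  return {ENOTSUP, 0, 0, "IPv6 socket, but neither option variant is available"};
}
```

The caller can then log Selection::reason at warn level before returning the error, which gives the operator the configuration hint asked for above.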
  }
}

TEST_F(SocketOptionImplTest, SetOptionFreeebindSuccessFalse) {
optional: I'm a big fan of per test comments
test/mocks/api/mocks.cc
Outdated
    socklen_t optlen) {
  ASSERT(optlen == sizeof(int));

  // Allow mock to fail us
allow mocking system call failure?
LGTM if invalid is the correct word in this case :-)
  return ENOTSUP;
}

// If the FD is v4, we can only try the IPv4 variant.
if (ip->version() == Network::Address::IpVersion::v4) {
  if (!ipv4_optname) {
    ENVOY_LOG(warn, "Invalid IPv4 socket option");
invalid or unspecified?
Unsupported :D
This patch introduces support for setting IP_FREEBIND on both listener sockets and upstream
connection sockets prior to binding. This enables the use of IP addresses that are not currently
bound to the NIC for listening and initiating connections from. This is useful in environments with
virtualized networking.
There's also some related work on SocketOption that continues from #2734, which was needed to enable
this to work cleanly.
Risk Level: Low (no change unless enabled).
Testing: Unit tests for ListenerManager, ClusterManager and SocketOptionImpl. Manual end-to-end
validation with steps described in configs/freebind/README.md.
API Changes: envoyproxy/data-plane-api#536
Fixes #528.
Signed-off-by: Harvey Tuch <htuch@google.com>
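For context on the behaviour described in the PR summary, here is a Linux-only sketch of setting IP_FREEBIND before bind() using plain POSIX calls. It is not the Envoy implementation; bindIpv4 is a hypothetical function written for this illustration.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Linux-only sketch: optionally set IP_FREEBIND, then bind an IPv4 TCP socket
// to ip:0. Returns 0 on success or the errno of the first failing call.
int bindIpv4(const char* ip, bool freebind) {
  const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return errno;
  }
  int rc = 0;
  if (freebind) {
    const int on = 1;
    // Must happen before bind(); this corresponds to the PreBind state in the PR.
    if (::setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &on, sizeof(on)) != 0) {
      rc = errno;
    }
  }
  if (rc == 0) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = 0; // Let the kernel pick a port.
    if (::inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
      rc = EINVAL;
    } else if (::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) != 0) {
      rc = errno;
    }
  }
  ::close(fd);
  return rc;
}
```

On a typical Linux host, binding to an address that is not configured on any interface (for example one from the 192.0.2.0/24 documentation range) would be expected to fail with EADDRNOTAVAIL when freebind is false and to succeed when it is true, although this depends on the host's network configuration.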