socket: IP_FREEBIND support for listeners and upstream connections. #2922

Merged
merged 1 commit into from
Mar 29, 2018

Member

@htuch htuch commented Mar 28, 2018

This patch introduces support for setting IP_FREEBIND on both listener sockets and upstream
connection sockets prior to binding. This enables the use of IP addresses that are not currently
bound to the NIC for listening and initiating connections from. This is useful in environments with
virtualized networking.

There's also some related work on SocketOption that continues from #2734, which was needed to enable
this to work cleanly.

Risk Level: Low (no change unless enabled).
Testing: Unit tests for ListenerManager, ClusterManager and SocketOptionImpl. Manual end-to-end
validation with steps described in configs/freebind/README.md.
API Changes: envoyproxy/data-plane-api#536

Fixes #528.

Signed-off-by: Harvey Tuch <htuch@google.com>
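
For readers unfamiliar with the option, the following is a minimal sketch of what "setting IP_FREEBIND prior to binding" means at the POSIX level. It is not the code from this patch: `enableFreebind` and `bindNonLocal` are hypothetical helper names, and the sketch assumes a Linux host, since `IP_FREEBIND` is Linux-specific.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>

// Hypothetical helper (not Envoy code): enable IP_FREEBIND on an IPv4 socket.
// Returns 0 on success, or an errno value on failure.
int enableFreebind(int fd) {
#ifdef IP_FREEBIND
  const int on = 1;
  if (::setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &on, sizeof(on)) != 0) {
    return errno;
  }
  return 0;
#else
  (void)fd;
  return ENOTSUP; // The option does not exist on this platform.
#endif
}

// Hypothetical helper: open a TCP socket, set IP_FREEBIND, then bind to an
// address that need not be configured on any local interface.
// Returns the bound fd, or -1 on failure.
int bindNonLocal(const char* ipv4, uint16_t port) {
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  if (::inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
    return -1; // Not a valid dotted-quad address.
  }
  const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return -1;
  }
  // The option has to be set before bind() for it to affect this bind.
  if (enableFreebind(fd) != 0 ||
      ::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) != 0) {
    ::close(fd);
    return -1;
  }
  return fd;
}
```

Without the option, the same `bind()` to an address that is not assigned locally would normally fail with `EADDRNOTAVAIL`.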

Member Author

htuch commented Mar 28, 2018

@jrajahalme @rlenglet this is the SocketOption work I've been working on. We should figure out how to reconcile it with #2719, which I based some parts of this on.

@@ -0,0 +1,39 @@
admin:
Member

Quick drive by: Can we test this config (just that it loads) in our config tests? Along with the original_dst config? From a quick look it doesn't seem like they are being tested for config sanity.

Member Author

I'll take this as a follow-up action item. I have a WiP to fix this, but it's a bit complicated because we're using a MockListenerComponentFactory, which is not compatible with socket options. I'd like to unblock #2719.

Member

OK SGTM.

Contributor

+1 - let's avoid text config because when it stops working we won't be alerted but a TODO is fine.

I think we do need to be able to test socket options, esp with all the other PRs in flight

htuch added a commit to htuch/envoy-api that referenced this pull request Mar 28, 2018
Corresponding PR envoyproxy/envoy#2922.

Signed-off-by: Harvey Tuch <htuch@google.com>
Contributor

@rlenglet rlenglet left a comment

The SocketOptionImpl design LGTM.


class SocketOptionImpl : public Socket::Option, Logger::Loggable<Logger::Id::connection> {
public:
SocketOptionImpl(bool freebind) : freebind_(freebind) {}
Contributor

Just to confirm, when adding support for another option, we'll only need to modify this library here and 2 subclasses (UpstreamSocketOption, and ListenerSocketOption). Is that correct?

Member Author

Correct. You only need to modify here if it's a shared option, and specifically in the subclasses if it's listener or upstream only. I haven't fully plumbed the hash stuff, I'll leave that for you folks.
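
The layering described in this thread can be sketched roughly as follows. This is an illustration only, under the assumption that shared options live in the base class and listener-only or upstream-only options live in subclasses; `FakeSocket`, `SocketState`, and `listener_only_flag` are hypothetical stand-ins, not Envoy's real types or options.

```cpp
#include <optional>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for Envoy's socket types.
enum class SocketState { PreBind, PostBind };

struct FakeSocket {
  std::vector<std::string> applied; // Records which options were "set".
};

// Options shared by listeners and upstream connections go in the base class.
class SocketOptionImpl {
public:
  explicit SocketOptionImpl(std::optional<bool> freebind) : freebind_(freebind) {}
  virtual ~SocketOptionImpl() = default;

  virtual bool setOption(FakeSocket& socket, SocketState state) const {
    if (state == SocketState::PreBind && freebind_.has_value()) {
      socket.applied.push_back(freebind_.value() ? "freebind=1" : "freebind=0");
    }
    return true;
  }

private:
  const std::optional<bool> freebind_;
};

// A listener-only option is added in the listener subclass, after the shared ones.
class ListenerSocketOption : public SocketOptionImpl {
public:
  ListenerSocketOption(std::optional<bool> freebind, bool listener_only_flag)
      : SocketOptionImpl(freebind), listener_only_flag_(listener_only_flag) {}

  bool setOption(FakeSocket& socket, SocketState state) const override {
    if (!SocketOptionImpl::setOption(socket, state)) {
      return false;
    }
    if (state == SocketState::PreBind && listener_only_flag_) {
      socket.applied.push_back("listener_only=1");
    }
    return true;
  }

private:
  const bool listener_only_flag_; // Hypothetical listener-specific option.
};

// An upstream subclass with no extra options simply inherits the shared behavior.
class UpstreamSocketOption : public SocketOptionImpl {
public:
  using SocketOptionImpl::SocketOptionImpl;
};
```

With this shape, adding a shared option touches only the base class, while a listener-only or upstream-only option touches only the corresponding subclass.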

namespace Network {

bool SocketOptionImpl::setOption(Socket& socket, Socket::SocketState state) const {
if (state == Socket::SocketState::PreBind) {
Contributor

Please also move the SOL_SOCKET, SO_REUSEADDR option from TcpListenSocket::TcpListenSocket, so that all socket options are set in this one library.
This will require either defining a new setTcpSocketOption method or passing setsockopt's level as a parameter to use SOL_SOCKET.

Member Author

I will leave a TODO, I think this is orthogonal and that setsockopt doesn't cause any #ifdef tangle. So, in the interest of unblocking #2719 will skip for now.

Contributor

@rlenglet rlenglet left a comment

@htuch I think you should also update the JSON parser to add support for the freebind option, cf. for example what I've done for the transparent option in #2719:
https://github.com/envoyproxy/envoy/pull/2719/files#diff-ee0a88c03932b3f263f41d176e762f1aR60
https://github.com/envoyproxy/envoy/pull/2719/files#diff-dcd9a089191ce4c6b1175d44e65ac9b0R188

@ggreenway
Contributor

@rlenglet We're trying very hard to not add any more v1 config changes, as it is deprecated.

Contributor

@ggreenway ggreenway left a comment

I know some of this work is to make it easier for the other socket options in the queue to be added. Can you give a brief overview of what those additions will look like? Specifically, I'm not understanding whether SocketOptionImpl is going to be extended with other options, or whether those should go in separate classes.

@@ -55,13 +55,14 @@ class Socket {
*/
virtual void hashKey(std::vector<uint8_t>& key) const PURE;
};
typedef std::unique_ptr<Option> OptionPtr;
typedef std::shared_ptr<std::vector<OptionPtr>> OptionsSharedPtr;
typedef std::shared_ptr<const Option> OptionConstSharedPtr;
Contributor

Can you comment on why this needs to be a shared_ptr (not unique)? I skimmed the code but didn't see where/why they need to be shared.

Member

+1, why does this need to be shared?

Member Author

This is now an immutable object which benefits from shared const ownership; we do this already for address instances and some other cases. unique_ptr was problematic given how the OptionPtr vector is combined in https://github.com/envoyproxy/envoy/pull/2922/files#diff-cd280606785949980237b2a7f08b7885R95, you would need some way to copy/clone the objects across the lists otherwise.

Member

I see where you need the shared_ptr in the vector. What I didn't like was OptionsSharedPtr being a shared_ptr to a vector of shared_ptrs; if it is inevitable, I'm fine.
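
To make the ownership argument in this thread concrete, here is a small sketch of combining option lists when the elements are `shared_ptr<const Option>`. The `Option`, `NamedOption`, and `combine` definitions are simplified, hypothetical stand-ins, not the code at the linked diff.

```cpp
#include <memory>
#include <vector>

// Hypothetical, simplified stand-in for Socket::Option.
struct Option {
  virtual ~Option() = default;
  virtual int id() const = 0;
};

using OptionConstSharedPtr = std::shared_ptr<const Option>;
using Options = std::vector<OptionConstSharedPtr>;
using OptionsSharedPtr = std::shared_ptr<Options>;

struct NamedOption : Option {
  explicit NamedOption(int id) : id_(id) {}
  int id() const override { return id_; }
  int id_;
};

// Build a new list from an existing list plus one extra option. Copying the
// vector only copies pointers, so the immutable option objects are shared
// between both lists rather than cloned.
OptionsSharedPtr combine(const OptionsSharedPtr& base, const OptionConstSharedPtr& extra) {
  auto combined = std::make_shared<Options>();
  if (base) {
    *combined = *base; // Would not compile for a vector of unique_ptr.
  }
  if (extra) {
    combined->push_back(extra);
  }
  return combined;
}
```

With `std::unique_ptr` elements, the vector copy above is ill-formed, so each option would need an explicit clone operation to appear in two lists.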

@@ -26,9 +26,9 @@ class SocketImpl : public virtual Socket {
fd_ = -1;
}
}
void addOption(OptionPtr&& option) override {
void addOption(OptionConstSharedPtr&& option) override {
Contributor

As a shared_ptr, this probably shouldn't be an rvalue.

Member Author

Good catch, this was a global search/replace snafu.

Member

You probably already saw but this issue is in a bunch of other places.

SocketOptionName ipv6_optname, const void* optval, socklen_t optlen);

private:
const bool freebind_;
Contributor

Should this be an optional, to match the type in data-plane-api?

Member Author

Yeah, I went back-and-forth on whether we should push the optional logic to the client or put it here. I think given my understanding of how SocketOptions are intended to be combined, we probably should make it optional here, so I'll make this change.
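
One way to read the "optional" semantics discussed here is: unset means leave the socket alone, while an explicit `false` still issues the call with value 0. The sketch below illustrates that reading only; `applyFreebind` and the injected `SetIntOption` callback are hypothetical names, and `std::optional` stands in for whatever optional type the codebase uses.

```cpp
#include <functional>
#include <optional>

// Hypothetical injected call that sets the freebind flag on some socket and
// returns 0 on success, mirroring setsockopt's convention.
using SetIntOption = std::function<int(int value)>;

// Returns true if nothing needed to be done or the call succeeded.
bool applyFreebind(const std::optional<bool>& freebind, const SetIntOption& set_option) {
  if (!freebind.has_value()) {
    return true; // Not configured: keep the platform default, make no call.
  }
  // Explicit bool-to-int conversion for the integer option value.
  const int value = freebind.value() ? 1 : 0;
  return set_option(value) == 0;
}
```

Keeping the unset state distinct from `false` matters when several option sources are combined, because an unset value can then defer to another source instead of overriding it.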

@rlenglet
Contributor

If there's more work to be done here, could we try to get #2719 merged first?

Member Author

htuch commented Mar 29, 2018

@rlenglet I think in order to get through review for #2719 we'd need to fix up a bunch of stuff that is in this PR. I don't think there's more work to be done in this specific PR; I have two follow-ups that I'm going to take to future PRs: (1) address @mattklein123's ask for freebind.yaml to be config sanity checked in tests and (2) think about how to test outbound socket freebind with packets flowing (I can manually validate the bind/bind fail there).

So, we should be able to move quickly here.

- socket_address:
address: 127.0.0.1
port_value: 10001
# upstream_bind_config:
Member

Do you mean for this to be commented out?

Member Author

Yes, it relates to my TODO on figuring out how to test this working with packets flowing end-to-end. I was mucking around with iptables today, but it seems that basic NATing isn't sufficient. I'll add an explicit TODO there as well.

if (options) {
*cluster_options = *options;
}
if (cluster.features() & ClusterInfo::Features::FREEBIND) {
Member

This line is redundant, it is already met in line 94.

Member Author

Thanks, that was some forest-and-trees code, will remove.

if (socket.localAddress()) {
ip = socket.localAddress()->ip();
} else {
address = Address::addressFromFd(socket.fd());
Member

When do we not have a localAddress() here? Can you add some more comments? (Just wondering why we can't just check if this is an IP socket and just return if not -- why do we need to try to get the address from the fd)

Member Author

Member

That's a bummer. I would probably add a TODO here to look into cleaning this up. Optimally, localAddress() would be available unconditionally before we call this code so we can avoid this exception stuff.

Member Author

I don't know if it's possible to clean this up; we only know the full localAddress() after we have done the bind/connect AFAIK, since that determines the local port for outgoing connections. I'll add a TODO to provide the IP version earlier, which we can do.
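
As a rough illustration of learning the IP version from a file descriptor when no local address has been recorded yet, the sketch below uses plain POSIX `getsockname()`. It is not Envoy's `Address::addressFromFd`; `ipVersionFromFd` and the `IpVersion` enum are hypothetical, and the sketch assumes the platform reports the socket's address family for an unbound socket, as Linux does.

```cpp
#include <sys/socket.h>
#include <unistd.h>

#include <optional>

enum class IpVersion { v4, v6 };

// Hypothetical helper: report the IP version of a socket from its fd alone.
// Returns nullopt if the fd is invalid or the socket is not an IP socket.
std::optional<IpVersion> ipVersionFromFd(int fd) {
  sockaddr_storage storage{};
  socklen_t len = sizeof(storage);
  if (::getsockname(fd, reinterpret_cast<sockaddr*>(&storage), &len) != 0) {
    return std::nullopt;
  }
  switch (storage.ss_family) {
  case AF_INET:
    return IpVersion::v4;
  case AF_INET6:
    return IpVersion::v6;
  default:
    return std::nullopt; // For example, a Unix domain socket.
  }
}
```

Returning an empty optional rather than throwing keeps the non-IP case on the ordinary control path.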

Member Author

htuch commented Mar 29, 2018

@ggreenway the context is #2719; that's where additional options start to appear (specifically on the Listener side). I actually based this on the #2719 patch and then added IP_FREEBIND, generalized/refactored, and then removed the IP_TRANSPARENT bits.

}

// Socket::Option implementation for API-defined listener upstream options.
Member

What is a "listener upstream option?" Clarify?

Member Author

That's a copy-paste snafu, fixed.


/**
* Add a socket option visitor for later retrieval with options().
*/
virtual void addOption(OptionPtr&&) PURE;
virtual void addOption(OptionConstSharedPtr) PURE;
Member

nit: this should probably be const OptionConstSharedPtr& here and elsewhere similar.
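
The parameter-passing nit can be illustrated with a small sketch. `Option` and `SocketImpl` below are simplified, hypothetical stand-ins; the point is only that a `const OptionConstSharedPtr&` parameter leaves the caller's pointer intact and makes exactly one copy, when the option is stored.

```cpp
#include <memory>
#include <vector>

// Simplified, hypothetical stand-ins for Socket::Option and SocketImpl.
struct Option {
  virtual ~Option() = default;
};
using OptionConstSharedPtr = std::shared_ptr<const Option>;

class SocketImpl {
public:
  // A const reference avoids a reference-count bump at the call boundary;
  // the push_back below is the only copy of the shared_ptr that is made.
  void addOption(const OptionConstSharedPtr& option) { options_.push_back(option); }

  const std::vector<OptionConstSharedPtr>& options() const { return options_; }

private:
  std::vector<OptionConstSharedPtr> options_;
};
```

An rvalue-reference parameter would instead force callers to `std::move` their pointer away, which works against sharing one immutable option across several sockets.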

* platform for fd after the above option level fallback semantics are taken into account or the
* socket is non-IP.
*/
static int setIpSocketOption(Socket& socket, SocketOptionName ipv4_optname,
Member

This is a non-actionable thought/comment: but I do feel like this series of changes is increasingly putting more UNIX/Linux stuff directly in the codebase and I wish we were thinking a bit more about x-platform. I would not worry about it now but just raising as food for thought.

Member Author

@htuch htuch Mar 29, 2018

I think this PR deals reasonably with the POSIXy side of the world. I agree that in general though, we would benefit from having something like source/platform that moves these specifics out of the Envoy core. It seems inevitable that any sufficiently advanced proxy will have features that are deeply tied to specific kernel features, so we can't be too abstract.

What are the other non-POSIXy platforms we care about besides Windows?

Member

Realistically I think only Windows right now. (I agree this is a very nice abstraction around POSIX socket options that might not exist on the host system.)

Contributor

+1 for platform work early while it's not too painful.
Harvey: consider pinging Randy when he joins the team next week - I wonder if we can steal best practices from the chrome network stack...

Member

@mattklein123 mattklein123 left a comment

Cool stuff. Generally LGTM at a high level. Left some drive by comments.

if (freebind_) {
const int option = 1;
if (freebind_.has_value()) {
const int option = freebind_.value();
Contributor

nit: converting from bool to int, I'd prefer "freebind_.value() ? 1 : 0". But feel free to disagree with my style preference.

Contributor

also naming: should_freebind? It makes the SetSocketOption() more clear

ggreenway previously approved these changes Mar 29, 2018
Contributor

@ggreenway ggreenway left a comment

Aside from one minor nit (that you can choose to ignore if you want), this LGTM

Contributor

@alyssawilk alyssawilk left a comment

Thanks for driving this through!

return os_syscalls.setsockopt(socket.fd(), IPPROTO_IP, ipv4_optname.value(), optval, optlen);
}

// If the FD is v6, we first try the IPv6 variant if we can and fallback to the IPv4 variant.
Contributor

-> we use the IPv6 variant if configured, otherwise we use the IPv4 variant.

We don't really try both.

if (ipv4_optname) {
return os_syscalls.setsockopt(socket.fd(), IPPROTO_IP, ipv4_optname.value(), optval, optlen);
}
return ENOTSUP;
Contributor

I strongly encourage LOGs for each failure mode. "this failed" is less useful than "you're using an IPv4 socket and forgot and configured an IPv6 only option"
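
The selection logic under discussion, together with a distinct message per failure mode, might look roughly like the sketch below. It follows the behavior visible in the hunks above (an IPv4 socket uses only the IPv4 variant; an IPv6 socket uses the IPv6 variant if one is given, otherwise the IPv4 variant), but `chooseIpOption`, `Choice`, and the message wording are hypothetical and not taken from the patch.

```cpp
#include <netinet/in.h>

#include <optional>
#include <string>

enum class IpVersion { v4, v6 };
using SocketOptionName = std::optional<int>;

// The setsockopt level and option name that would be used.
struct Choice {
  int level;
  int optname;
};

// Hypothetical helper: decide which option variant applies to a socket of the
// given IP version. On failure, returns nullopt and fills in a reason that
// names the specific failure mode.
std::optional<Choice> chooseIpOption(IpVersion version, SocketOptionName ipv4_optname,
                                     SocketOptionName ipv6_optname, std::string& reason) {
  if (version == IpVersion::v4) {
    if (!ipv4_optname) {
      reason = "IPv4 socket, but the option has no IPv4 variant on this platform";
      return std::nullopt;
    }
    return Choice{IPPROTO_IP, *ipv4_optname};
  }
  // IPv6 socket: use the IPv6 variant if there is one, otherwise the IPv4 variant.
  if (ipv6_optname) {
    return Choice{IPPROTO_IPV6, *ipv6_optname};
  }
  if (ipv4_optname) {
    return Choice{IPPROTO_IP, *ipv4_optname};
  }
  reason = "IPv6 socket, but neither an IPv6 nor an IPv4 variant is available";
  return std::nullopt;
}
```

A caller could log `reason` at warning level and return `ENOTSUP`, so the message tells the operator which combination was missing rather than only that the call failed.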

}
}

TEST_F(SocketOptionImplTest, SetOptionFreeebindSuccessFalse) {
Contributor

optional: I'm a big fan of per test comments

socklen_t optlen) {
ASSERT(optlen == sizeof(int));

// Allow mock to fail us
Contributor

allow mocking system call failure?

alyssawilk previously approved these changes Mar 29, 2018
Contributor

@alyssawilk alyssawilk left a comment

LGTM if invalid is the correct word in this case :-)

return ENOTSUP;
}

// If the FD is v4, we can only try the IPv4 variant.
if (ip->version() == Network::Address::IpVersion::v4) {
if (!ipv4_optname) {
ENVOY_LOG(warn, "Invalid IPv4 socket option");
Contributor

invalid or unspecified?

Member Author

Unsupported :D

@htuch htuch merged commit 725457c into envoyproxy:master Mar 29, 2018
@htuch htuch deleted the ip-freebind branch March 29, 2018 19:53
mattklein123 pushed a commit to envoyproxy/data-plane-api that referenced this pull request Mar 29, 2018
Corresponding PR envoyproxy/envoy#2922.

Signed-off-by: Harvey Tuch <htuch@google.com>
8 participants