Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connection: Set the SOL_IP,IP_TRANSPARENT option on listen sockets #2719

Merged
merged 7 commits into from
Mar 30, 2018
Merged

connection: Set the SOL_IP,IP_TRANSPARENT option on listen sockets #2719

merged 7 commits into from
Mar 30, 2018

Conversation

rlenglet
Copy link
Contributor

@rlenglet rlenglet commented Mar 5, 2018

connection: Set the SOL_IP,IP_TRANSPARENT option on listen sockets

Implement the "transparent" option in the LDS Listener to set the SOL_IP/IP_TRANSPARENT option on listen sockets, which allows using Envoy with the iptables TPROXY target. Unlike the iptables REDIRECT target, TPROXY allows preserving both the source and destination IP addresses and ports of accepted connections.

Update the OriginalDst filter to restore the original destination IP address of TPROXY connections in addition to REDIRECT connections.

Signed-off-by: Romain Lenglet romain@covalent.io
Signed-off-by: Jarno Rajahalme jarno@covalent.io
Risk Level: Low

// instead of REDIRECT to get both the original source and destination addresses of the
// redirected socket.
on = 1;
rc = setsockopt(fd_, SOL_IP, IP_TRANSPARENT, &on, sizeof(on));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be set unconditionally; this should only be done based on some configuration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@dnoe
Copy link
Contributor

dnoe commented Mar 5, 2018

This seems like it should be configurable per listener. Can you create a PR in https://github.com/envoyproxy/data-plane-api with a proposed modification to LDS configuration?

@rlenglet
Copy link
Contributor Author

rlenglet commented Mar 5, 2018

Will do. Thanks!

@rlenglet
Copy link
Contributor Author

rlenglet commented Mar 5, 2018

Submitted this PR to data-plane-api: envoyproxy/data-plane-api#522.
Will work on implementing the flag.

Copy link
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: #2659.

// instead of REDIRECT to get both the original source and destination addresses of the
// redirected socket.
on = 1;
rc = setsockopt(fd_, SOL_IP, IP_TRANSPARENT, &on, sizeof(on));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, this should be IP_TRANSPARENT for IPv4 and IPV6_TRANSPARENT for IPv6.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in upcoming commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this won't work for IPv6 when building using Envoy's Docker image (Ubuntu 16.04.3 with libc6-dev 2.23-0ubuntu10) because IPV6_TRANSPARENT is not defined in that image:

$ ./ci/run_envoy_docker.sh 'grep IPV6_TRANSPARENT /usr/include/x86_64-linux-gnu/bits/in.h'

returns nothing.

On Ubuntu 17.10 with libc6-dev 2.26-0ubuntu2.1, this macro is defined as expected:

$ grep IPV6_TRANSPARENT /usr/include/x86_64-linux-gnu/bits/in.h
#define IPV6_TRANSPARENT	75

rc = setsockopt(fd_, SOL_IP, IP_TRANSPARENT, &on, sizeof(on));
if (rc == -1) {
if (errno == EPERM) {
// Ignore. Not running with CAP_NET_ADMIN capability.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a real error, if configuration requires IP_TRANSPARENT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in upcoming commit.

@@ -37,6 +38,22 @@ TcpListenSocket::TcpListenSocket(const Address::InstanceConstSharedPtr& address,
int rc = setsockopt(fd_, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
RELEASE_ASSERT(rc != -1);

#if defined SOL_IP && defined IP_TRANSPARENT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this can be moved into if (bind_to_port) { ... }, since as far as I recall, it's useless without bind().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could make the same argument about the existing setsockopt(fd_, SOL_SOCKET, SO_REUSEADDR, ...) above. Should I move all setsockopt calls into if (bind_to_port) { ... }?

// instead of REDIRECT to get both the original source and destination addresses of the
// redirected socket.
on = 1;
rc = setsockopt(fd_, SOL_IP, IP_TRANSPARENT, &on, sizeof(on));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@rlenglet
Copy link
Contributor Author

rlenglet commented Mar 6, 2018

Re: #2659, this is a different, complementary issue. You need first to implement IP_TRANSPARENT in the inbound listen socket anyway, like I do here, so that you get the original source address. Without that, you won't know what address to bind to.

@rlenglet
Copy link
Contributor Author

rlenglet commented Mar 6, 2018

The mac tests timed out. It doesn't seem to be a functional failure. Is that something I should look into?

htuch pushed a commit to envoyproxy/data-plane-api that referenced this pull request Mar 6, 2018
…#522)

Add a "transparent" option to Listener to set the SOL_IP/IP_TRANSPARENT option on listen sockets, which allows using Envoy with the iptables TPROXY target.
Unlike the iptables REDIRECT target, TPROXY allows preserving both the source and destination IP addresses and ports of accepted connections.

API changes for: envoyproxy/envoy#2719

Signed-off-by: Romain Lenglet <romain@covalent.io>
@htuch
Copy link
Member

htuch commented Mar 6, 2018

@rlenglet probably #2428, let's see if that is fixed here.

rlenglet added a commit to cilium/cilium that referenced this pull request Mar 8, 2018
Pull a quick-and-dirty implementation of the SOL_IP,IP_TRANSPARENT
option setup on listen sockets:
https://github.com/rlenglet/envoy/tree/quick-and-dirty-transparent
This will be reverted when the proper, clean change is merged
upstream into Envoy:
envoyproxy/envoy#2719

Signed-off-by: Romain Lenglet <romain@covalent.io>
@htuch
Copy link
Member

htuch commented Mar 8, 2018

@rlenglet can you merge to resolve conflict?
@PiotrSikora @dnoe @ggreenway any further comments?

* @param bind_to_port supplies whether to actually bind the socket.
* @return Network::SocketSharedPtr an initialized and potentially bound socket.
*/
virtual Network::SocketSharedPtr
createListenSocket(Network::Address::InstanceConstSharedPtr address, bool bind_to_port) PURE;
createListenSocket(Network::Address::InstanceConstSharedPtr address, bool transparent,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Do we foresee having more options like this in the future? If so should we switch to some type of options struct that gets passed around? That would a avoid having to change every individual callsite and test again in the future potentially. Just throwing it out there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't foresee more options like this. But if we think that could happen, I could use #2734 to implement this internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't foresee more options like this than I wouldn't worry about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one more option coming that is similar/related: enabling tcp-fastopen on listening sockets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggreenway So what is your preference? Should we implement this with #2734?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question.. how does IP_FREEBIND differ from IP_TRANSPARENT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggreenway Would it work for you better if I generalized the current option setting code (i.e., #2734 ) to accept multiple option setters, maintain a list of set options and have the call sites call all of them (unless one of them fails)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnoe OK SGTM if you want to do the struct stuff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrajahalme yes that sounds reasonable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FREEBIND allows binding to an address that isn't configured on any interface on the machine. IP_TRANSPARENT would also allow this, but from inside Envoy should otherwise work the same as a regular listener (whereas transparent is designed around a different use case where the traffic has been redirected to Envoy and has some original destination that is meaningful).

dnoe
dnoe previously approved these changes Mar 8, 2018
Copy link
Contributor

@dnoe dnoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Copy link
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there are no tests (due to iptables), did anybody other than the author tested that it really works?

Another questions is, how does this interact with SO_ORIGINAL_DST and PROXY protocol?

EXPECT_TRUE(manager_->addOrUpdateListener(parseListenerFromJson(listener_foo_json), true));
checkStats(1, 0, 0, 0, 1, 0);

// Update foo listener, but with a different transparent option. Should throw.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we cannot disable IP_TRANSPARENT for the new listener?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora I was thinking that we have an atomicity issue. How do we atomically hand off the socket to the new listener and set/unset the option? That would require that there is a period of time when no thread is accepting on the socket during the handoff. I don't think we have that (?). And that's not desirable, since that would cause connections to be rejected during that period.

Copy link
Contributor

@PiotrSikora PiotrSikora Mar 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rlenglet there is always a period of time when no thread is accepting new connections (i.e. when all threads are busy processing active requests), and no connections are rejected because of kernel-maintained backlog queue on the listening socket, so I don't think lack of atomicity is the problem.

Having said that, I'm fine if you want to leave it as-is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm removing that verification. We know from an accepted socket's option whether it has been TPROXYed or not, so we can safely have a mix of connections with and without TPROXY in the same queue.

@rlenglet
Copy link
Contributor Author

rlenglet commented Mar 8, 2018

@PiotrSikora It should have no interaction with SO_ORIGINAL_DST or PROXY protocol. It basically has no effect unless one uses iptables TPROXY to redirect connections to the socket.

@PiotrSikora
Copy link
Contributor

@rlenglet my point was what happens if you use IP_TRANSPARENT with SO_ORIGINAL_DST and/or PROXY protocol? I believe that PROXY protocol will just work fine (and take priority), but the behavior of SO_ORIGINAL_DST is unclear, and if the two are incompatible, then perhaps config guard is in order?

@htuch
Copy link
Member

htuch commented Mar 28, 2018

@rlenglet can you take a look at #2922? I started with some of the pieces in this PR, but there was a bunch of cleanup I did to make it (1) avoid the #ifdef proliferation, (2) add support for upstream as well as listener and (3) work with additional options without boiletplate. I think it might make sense to merge that one first and then this present PR should be much simpler.

@rlenglet
Copy link
Contributor Author

@htuch If you can get #2922 merged soon, I'm fine with it getting merged first.

@htuch
Copy link
Member

htuch commented Mar 29, 2018

@rlenglet #2922 has merged.

@rlenglet
Copy link
Contributor Author

@htuch Great! I'll merge from master today.

…nsparent

Signed-off-by: Romain Lenglet <romain@covalent.io>
@rlenglet
Copy link
Contributor Author

@htuch Please review. I merged from master.
I fixed the calls to the SocketOptionImpl constructor to pass empty optionals when the options were not set.

rlenglet added a commit to cilium/cilium that referenced this pull request Mar 30, 2018
Pull a quick-and-dirty implementation of the SOL_IP,IP_TRANSPARENT
option setup on listen sockets:
https://github.com/rlenglet/envoy/tree/quick-and-dirty-transparent
This will be reverted when the proper, clean change is merged
upstream into Envoy:
envoyproxy/envoy#2719

Signed-off-by: Romain Lenglet <romain@covalent.io>
Signed-off-by: Romain Lenglet <romain@covalent.io>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, the implementation is ready to ship. Can you add a listener_manager_impl_test similar to what is done for freebind and was done in earlier revisions?

@mattklein123
Copy link
Member

Love, love, love how simple this all became. Great work @ggreenway @PiotrSikora @htuch @rlenglet!

@rlenglet can we get the doc/release note PR redone before we merge this? Thank you.

Signed-off-by: Romain Lenglet <romain@covalent.io>
Signed-off-by: Romain Lenglet <romain@covalent.io>
@rlenglet
Copy link
Contributor Author

@htuch Added tests. Looks clean.

@rlenglet
Copy link
Contributor Author

@mattklein123 Here is the release note PR: envoyproxy/data-plane-api#593
Thanks!

@rlenglet
Copy link
Contributor Author

rlenglet commented Mar 30, 2018

The ipv6_tests failure is due to the test environment:

ERROR: /source/bazel/repositories.bzl:203:5: no such package '@boringssl//': Traceback (most recent call last):
	File "/build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external/bazel_tools/tools/build_defs/repo/git.bzl", line 69
		_clone_or_update(ctx)
	File "/build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external/bazel_tools/tools/build_defs/repo/git.bzl", line 44, in _clone_or_update
		fail(("error cloning %s:\n%s" % (ctx....)))
error cloning boringssl:
+ cd /build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external
+ rm -rf /build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external/boringssl
+ git clone https://boringssl.googlesource.com/boringssl /build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external/boringssl
Cloning into '/build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external/boringssl'...
fatal: unable to access 'https://boringssl.googlesource.com/boringssl/': Failed to connect to boringssl.googlesource.com port 443: Network is unreachable
 and referenced by '//external:ssl'

@@ -82,6 +126,12 @@ TEST_F(SocketOptionImplTest, SetOptionFreeebindSuccessFalse) {
}
}

// Transparent settings have no effect on post-bind behavior.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought transparent does have an effect on both prebind and postbind?

Copy link
Contributor Author

@rlenglet rlenglet Mar 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I will relax that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a commit to support setting IP_TRANSPARENT both pre-bind and post-bind.

Signed-off-by: Romain Lenglet <romain@covalent.io>
Signed-off-by: Romain Lenglet <romain@covalent.io>
Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; just questioning whether we can remove the v1 support.

@@ -185,6 +185,7 @@ const std::string Json::Schema::LISTENER_SCHEMA(R"EOF(
},
"drain_type": {"type" : "string", "enum" : ["default", "modify_only"]},
"ssl_context" : {"$ref" : "#/definitions/ssl_context"},
"transparent" : {"type": "boolean"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the v1 API absolutely required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need it for Istio. Cf. istio/istio#4654 (review)

@htuch htuch merged commit 3b5fb0e into envoyproxy:master Mar 30, 2018
@htuch
Copy link
Member

htuch commented Mar 30, 2018

Thanks @rlenglet for all your patience here.

@rlenglet rlenglet deleted the listen-socket-transparent branch March 30, 2018 20:56
@rlenglet
Copy link
Contributor Author

@htuch Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants