Skip to content

Commit

Permalink
socket: IP_FREEBIND support for listeners and upstream connections. (#…
Browse files Browse the repository at this point in the history
…2922)

This patch introduces support for setting IP_FREEBIND on both listener sockets and upstream
connection sockets prior to binding. This enables the use of IP addresses that are not currently
bound to the NIC for listening and initiating connections from. This is useful in environments with
virtualized networking.

There's also some related work on SocketOption that continues from #2734, which was needed to enable
this to work cleanly.

Risk Level: Low (no change unless enabled).
Testing: Unit tests for ListenerManager, ClusterManager and SocketOptionImpl. Manual end-to-end
validation with steps described in configs/freebind/README.md.
API Changes: envoyproxy/data-plane-api#536

Fixes #528.

Signed-off-by: Harvey Tuch <htuch@google.com>
  • Loading branch information
htuch authored Mar 29, 2018
1 parent 116e7c3 commit 725457c
Show file tree
Hide file tree
Showing 42 changed files with 799 additions and 76 deletions.
2 changes: 1 addition & 1 deletion bazel/repository_locations.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ REPOSITORY_LOCATIONS = dict(
urls = ["https://github.com/google/protobuf/archive/v3.5.0.tar.gz"],
),
envoy_api = dict(
commit = "e73b8264a26c909f75d38ec2c73d9f47ad1b3b57",
commit = "ec157d3d8b359a1bd65c22116ba4387a459cc53a",
remote = "https://github.com/envoyproxy/data-plane-api",
),
grpc_httpjson_transcoding = dict(
Expand Down
14 changes: 14 additions & 0 deletions configs/freebind/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# Freebind testing

To manually validate the `IP_FREEBIND` behavior in Envoy, you can launch Envoy with
[freebind.yaml](freebind.yaml).

The listener free bind behavior can be verified with:

1. `envoy -c ./configs/freebind/freebind.yaml -l trace`
2. `sudo ifconfig lo:1 192.168.42.1/30 up`
3. `nc -v -l 0.0.0.0 10001`

To cleanup run `sudo ifconfig lo:1 down`.

TODO(htuch): Steps to verify upstream behavior.
41 changes: 41 additions & 0 deletions configs/freebind/freebind.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
admin:
access_log_path: /tmp/admin_access.log
address:
socket_address: { address: 127.0.0.1, port_value: 9901 }

static_resources:
listeners:
- name: listener_0
address:
socket_address: { address: 192.168.42.1, port_value: 10000 }
freebind: true
filter_chains:
- filters:
- name: envoy.http_connection_manager
config:
stat_prefix: ingress_http
route_config:
name: local_route
virtual_hosts:
- name: local_service
domains: ["*"]
routes:
- match: { prefix: "/" }
route: { cluster: service_local }
http_filters:
- name: envoy.router
clusters:
- name: service_local
connect_timeout: 30s
type: STATIC
lb_policy: ROUND_ROBIN
hosts:
- socket_address:
address: 127.0.0.1
port_value: 10001
# TODO(htuch): Figure out how to do end-to-end testing with
# outgoing connections and free bind.
# upstream_bind_config:
# source_address:
# address: 192.168.43.1
# freebind: true
11 changes: 11 additions & 0 deletions include/envoy/api/os_sys_calls.h
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,17 @@ class OsSysCalls {
* @see man 2 stat
*/
virtual int stat(const char* pathname, struct stat* buf) PURE;

/**
* @see man 2 setsockopt
*/
virtual int setsockopt(int sockfd, int level, int optname, const void* optval,
socklen_t optlen) PURE;

/**
* @see man 2 getsockopt
*/
virtual int getsockopt(int sockfd, int level, int optname, void* optval, socklen_t* optlen) PURE;
};

typedef std::unique_ptr<OsSysCalls> OsSysCallsPtr;
Expand Down
7 changes: 4 additions & 3 deletions include/envoy/network/listen_socket.h
Original file line number Diff line number Diff line change
Expand Up @@ -55,13 +55,14 @@ class Socket {
*/
virtual void hashKey(std::vector<uint8_t>& key) const PURE;
};
typedef std::unique_ptr<Option> OptionPtr;
typedef std::shared_ptr<std::vector<OptionPtr>> OptionsSharedPtr;
typedef std::shared_ptr<const Option> OptionConstSharedPtr;
typedef std::vector<OptionConstSharedPtr> Options;
typedef std::shared_ptr<Options> OptionsSharedPtr;

/**
* Add a socket option visitor for later retrieval with options().
*/
virtual void addOption(OptionPtr&&) PURE;
virtual void addOption(const OptionConstSharedPtr&) PURE;

/**
* @return the socket options stored earlier with addOption() calls, if any.
Expand Down
2 changes: 1 addition & 1 deletion include/envoy/server/filter_config.h
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@ class ListenerFactoryContext : public FactoryContext {
/**
* Store socket options to be set on the listen socket before listening.
*/
virtual void addListenSocketOption(Network::Socket::OptionPtr&& option) PURE;
virtual void addListenSocketOption(const Network::Socket::OptionConstSharedPtr& option) PURE;
};

/**
Expand Down
8 changes: 3 additions & 5 deletions include/envoy/upstream/cluster_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -145,12 +145,10 @@ class ClusterManager {
virtual void shutdown() PURE;

/**
* Returns an optional source address for upstream connections to bind to.
*
* @return Network::Address::InstanceConstSharedPtr a source address to bind to or nullptr if no
* bind need occur.
* @return const envoy::api::v2::core::BindConfig& cluster manager wide bind configuration for new
* upstream connections.
*/
virtual const Network::Address::InstanceConstSharedPtr& sourceAddress() const PURE;
virtual const envoy::api::v2::core::BindConfig& bindConfig() const PURE;

/**
* Return a reference to the singleton ADS provider for upstream control plane muxing of xDS. This
Expand Down
2 changes: 2 additions & 0 deletions include/envoy/upstream/upstream.h
Original file line number Diff line number Diff line change
Expand Up @@ -369,6 +369,8 @@ class ClusterInfo {
// Use the downstream protocol (HTTP1.1, HTTP2) for upstream connections as well, if available.
// This is used when creating connection pools.
static const uint64_t USE_DOWNSTREAM_PROTOCOL = 0x2;
// Use IP_FREEBIND socket option when binding.
static const uint64_t FREEBIND = 0x4;
};

virtual ~ClusterInfo() {}
Expand Down
10 changes: 10 additions & 0 deletions source/common/api/os_sys_calls_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -35,5 +35,15 @@ void* OsSysCallsImpl::mmap(void* addr, size_t length, int prot, int flags, int f

int OsSysCallsImpl::stat(const char* pathname, struct stat* buf) { return ::stat(pathname, buf); }

int OsSysCallsImpl::setsockopt(int sockfd, int level, int optname, const void* optval,
socklen_t optlen) {
return ::setsockopt(sockfd, level, optname, optval, optlen);
}

int OsSysCallsImpl::getsockopt(int sockfd, int level, int optname, void* optval,
socklen_t* optlen) {
return ::getsockopt(sockfd, level, optname, optval, optlen);
}

} // namespace Api
} // namespace Envoy
2 changes: 2 additions & 0 deletions source/common/api/os_sys_calls_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@ class OsSysCallsImpl : public OsSysCalls {
int ftruncate(int fd, off_t length) override;
void* mmap(void* addr, size_t length, int prot, int flags, int fd, off_t offset) override;
int stat(const char* pathname, struct stat* buf) override;
int setsockopt(int sockfd, int level, int optname, const void* optval, socklen_t optlen) override;
int getsockopt(int sockfd, int level, int optname, void* optval, socklen_t* optlen) override;
};

typedef ThreadSafeSingleton<OsSysCallsImpl> OsSysCallsSingleton;
Expand Down
14 changes: 14 additions & 0 deletions source/common/network/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -173,6 +173,20 @@ envoy_cc_library(
],
)

envoy_cc_library(
name = "socket_option_lib",
srcs = ["socket_option_impl.cc"],
hdrs = ["socket_option_impl.h"],
external_deps = ["abseil_optional"],
deps = [
":address_lib",
"//include/envoy/network:listen_socket_interface",
"//source/common/api:os_sys_calls_lib",
"//source/common/common:assert_lib",
"//source/common/common:logger_lib",
],
)

envoy_cc_library(
name = "utility_lib",
srcs = ["utility.cc"],
Expand Down
1 change: 1 addition & 0 deletions source/common/network/listen_socket_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ TcpListenSocket::TcpListenSocket(const Address::InstanceConstSharedPtr& address,
: ListenSocketImpl(address->socket(Address::SocketType::Stream), address) {
RELEASE_ASSERT(fd_ != -1);

// TODO(htuch): This might benefit from moving to SocketOptionImpl.
int on = 1;
int rc = setsockopt(fd_, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
RELEASE_ASSERT(rc != -1);
Expand Down
4 changes: 2 additions & 2 deletions source/common/network/listen_socket_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,9 @@ class SocketImpl : public virtual Socket {
fd_ = -1;
}
}
void addOption(OptionPtr&& option) override {
void addOption(const OptionConstSharedPtr& option) override {
if (!options_) {
options_ = std::make_shared<std::vector<OptionPtr>>();
options_ = std::make_shared<std::vector<OptionConstSharedPtr>>();
}
options_->emplace_back(std::move(option));
}
Expand Down
80 changes: 80 additions & 0 deletions source/common/network/socket_option_impl.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
#include "common/network/socket_option_impl.h"

#include "envoy/common/exception.h"

#include "common/api/os_sys_calls_impl.h"
#include "common/common/assert.h"
#include "common/network/address_impl.h"

namespace Envoy {
namespace Network {

bool SocketOptionImpl::setOption(Socket& socket, Socket::SocketState state) const {
if (state == Socket::SocketState::PreBind) {
if (freebind_.has_value()) {
const int should_freebind = freebind_.value() ? 1 : 0;
const int error =
setIpSocketOption(socket, ENVOY_SOCKET_IP_FREEBIND, ENVOY_SOCKET_IPV6_FREEBIND,
&should_freebind, sizeof(should_freebind));
if (error != 0) {
ENVOY_LOG(warn, "Setting IP_FREEBIND on listener socket failed: {}", strerror(error));
return false;
}
}
}

return true;
}

int SocketOptionImpl::setIpSocketOption(Socket& socket, SocketOptionName ipv4_optname,
SocketOptionName ipv6_optname, const void* optval,
socklen_t optlen) {
auto& os_syscalls = Api::OsSysCallsSingleton::get();

// If this isn't IP, we're out of luck.
Address::InstanceConstSharedPtr address;
const Address::Ip* ip = nullptr;
try {
// We have local address when the socket is used in a listener but have to
// infer the IP from the socket FD when initiating connections.
// TODO(htuch): Figure out a way to obtain a consistent interface for IP
// version from socket.
if (socket.localAddress()) {
ip = socket.localAddress()->ip();
} else {
address = Address::addressFromFd(socket.fd());
ip = address->ip();
}
} catch (const EnvoyException&) {
// Ignore, we get here because we failed in getsockname().
// TODO(htuch): We should probably clean up this logic to avoid relying on exceptions.
}
if (ip == nullptr) {
ENVOY_LOG(warn, "Failed to set IP socket option on non-IP socket");
return ENOTSUP;
}

// If the FD is v4, we can only try the IPv4 variant.
if (ip->version() == Network::Address::IpVersion::v4) {
if (!ipv4_optname) {
ENVOY_LOG(warn, "Unsupported IPv4 socket option");
return ENOTSUP;
}
return os_syscalls.setsockopt(socket.fd(), IPPROTO_IP, ipv4_optname.value(), optval, optlen);
}

// If the FD is v6, we first try the IPv6 variant if the platfrom supports it and fallback to the
// IPv4 variant otherwise.
ASSERT(ip->version() == Network::Address::IpVersion::v6);
if (ipv6_optname) {
return os_syscalls.setsockopt(socket.fd(), IPPROTO_IPV6, ipv6_optname.value(), optval, optlen);
}
if (ipv4_optname) {
return os_syscalls.setsockopt(socket.fd(), IPPROTO_IP, ipv4_optname.value(), optval, optlen);
}
ENVOY_LOG(warn, "Unsupported IPv6 socket option");
return ENOTSUP;
}

} // namespace Network
} // namespace Envoy
64 changes: 64 additions & 0 deletions source/common/network/socket_option_impl.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
#pragma once

#include <netinet/in.h>
#include <netinet/ip.h>
#include <sys/socket.h>

#include "envoy/network/listen_socket.h"

#include "common/common/logger.h"

#include "absl/types/optional.h"

namespace Envoy {
namespace Network {

// Optional variant of setsockopt(2) optname. The idea here is that if the option is not supported
// on a platform, we can make this the empty value. This allows us to avoid proliferation of #ifdef.
typedef absl::optional<int> SocketOptionName;

#ifdef IP_FREEBIND
#define ENVOY_SOCKET_IP_FREEBIND Network::SocketOptionName(IP_FREEBIND)
#else
#define ENVOY_SOCKET_IP_FREEBIND Network::SocketOptionName()
#endif

#ifdef IPV6_FREEBIND
#define ENVOY_SOCKET_IPV6_FREEBIND Network::SocketOptionName(IPV6_FREEBIND)
#else
#define ENVOY_SOCKET_IPV6_FREEBIND Network::SocketOptionName()
#endif

class SocketOptionImpl : public Socket::Option, Logger::Loggable<Logger::Id::connection> {
public:
SocketOptionImpl(absl::optional<bool> freebind) : freebind_(freebind) {}

// Socket::Option
bool setOption(Socket& socket, Socket::SocketState state) const override;
// The common socket options don't require a hash key.
void hashKey(std::vector<uint8_t>&) const override {}

/**
* Set a socket option that applies at both IPv4 and IPv6 socket levels. When the underlying FD
* is IPv6, this function will attempt to set at IPv6 unless the platform only supports the
* option at the IPv4 level.
* @param socket.
* @param ipv4_optname SocketOptionName for IPv4 level. Set to empty if not supported on
* platform.
* @param ipv6_optname SocketOptionName for IPv6 level. Set to empty if not supported on
* platform.
* @param optval as per setsockopt(2).
* @param optlen as per setsockopt(2).
* @return int as per setsockopt(2). ENOTSUP is returned if the option is not supported on the
* platform for fd after the above option level fallback semantics are taken into account or the
* socket is non-IP.
*/
static int setIpSocketOption(Socket& socket, SocketOptionName ipv4_optname,
SocketOptionName ipv6_optname, const void* optval, socklen_t optlen);

private:
const absl::optional<bool> freebind_;
};

} // namespace Network
} // namespace Envoy
1 change: 1 addition & 0 deletions source/common/upstream/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -342,6 +342,7 @@ envoy_cc_library(
"//source/common/http:utility_lib",
"//source/common/network:address_lib",
"//source/common/network:resolver_lib",
"//source/common/network:socket_option_lib",
"//source/common/network:utility_lib",
"//source/common/protobuf",
"//source/common/protobuf:utility_lib",
Expand Down
8 changes: 2 additions & 6 deletions source/common/upstream/cluster_manager_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -170,7 +170,8 @@ ClusterManagerImpl::ClusterManagerImpl(const envoy::config::bootstrap::v2::Boots
AccessLog::AccessLogManager& log_manager,
Event::Dispatcher& main_thread_dispatcher)
: factory_(factory), runtime_(runtime), stats_(stats), tls_(tls.allocateSlot()),
random_(random), local_info_(local_info), cm_stats_(generateStats(stats)),
random_(random), bind_config_(bootstrap.cluster_manager().upstream_bind_config()),
local_info_(local_info), cm_stats_(generateStats(stats)),
init_helper_([this](Cluster& cluster) { onClusterInit(cluster); }) {
async_client_manager_ = std::make_unique<Grpc::AsyncClientManagerImpl>(*this, tls);
const auto& cm_config = bootstrap.cluster_manager();
Expand All @@ -187,11 +188,6 @@ ClusterManagerImpl::ClusterManagerImpl(const envoy::config::bootstrap::v2::Boots
eds_config_ = bootstrap.dynamic_resources().deprecated_v1().sds_config();
}

if (bootstrap.cluster_manager().upstream_bind_config().has_source_address()) {
source_address_ = Network::Address::resolveProtoSocketAddress(
bootstrap.cluster_manager().upstream_bind_config().source_address());
}

// Cluster loading happens in two phases: first all the primary clusters are loaded, and then all
// the secondary clusters are loaded. As it currently stands all non-EDS clusters are primary and
// only EDS clusters are secondary. This two phase loading is done because in v2 configuration
Expand Down
6 changes: 2 additions & 4 deletions source/common/upstream/cluster_manager_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -181,9 +181,7 @@ class ClusterManagerImpl : public ClusterManager, Logger::Loggable<Logger::Id::u
active_clusters_.clear();
}

const Network::Address::InstanceConstSharedPtr& sourceAddress() const override {
return source_address_;
}
const envoy::api::v2::core::BindConfig& bindConfig() const override { return bind_config_; }

Config::GrpcMux& adsMux() override { return *ads_mux_; }
Grpc::AsyncClientManager& grpcAsyncClientManager() override { return *async_client_manager_; }
Expand Down Expand Up @@ -309,7 +307,7 @@ class ClusterManagerImpl : public ClusterManager, Logger::Loggable<Logger::Id::u
ClusterMap active_clusters_;
ClusterMap warming_clusters_;
absl::optional<envoy::api::v2::core::ConfigSource> eds_config_;
Network::Address::InstanceConstSharedPtr source_address_;
envoy::api::v2::core::BindConfig bind_config_;
Outlier::EventLoggerSharedPtr outlier_event_logger_;
const LocalInfo::LocalInfo& local_info_;
CdsApiPtr cds_api_;
Expand Down
Loading

0 comments on commit 725457c

Please sign in to comment.