Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssl: add support for SNI. #1984

Merged
merged 53 commits into from
Nov 28, 2017
Merged
Show file tree
Hide file tree
Changes from 29 commits
Commits
Show all changes
53 commits
Select commit Hold shift + click to select a range
76709a0
ssl: add support for SNI.
PiotrSikora Nov 1, 2017
d04afe2
Merge commit '908ddde3adb0634c966d1170319b8c1f571dab95' into piotrsik…
PiotrSikora Nov 7, 2017
0f984ce
review: skip context update in case of one filter chain.
PiotrSikora Nov 3, 2017
47e3df7
review: use ASSERT().
PiotrSikora Nov 3, 2017
227589b
review: fix comment.
PiotrSikora Nov 4, 2017
75abf26
review: use shared lock.
PiotrSikora Nov 7, 2017
0fecd6b
review: add tests.
PiotrSikora Nov 1, 2017
68541bf
review: add docs.
PiotrSikora Nov 7, 2017
33142cc
review: unordered_map find instead of direct access.
PiotrSikora Nov 7, 2017
ca9eb0b
reivew: fix format.
PiotrSikora Nov 7, 2017
7d1b1af
review: fix non-debug build.
PiotrSikora Nov 7, 2017
fc9705f
review: fix non-debug tests.
PiotrSikora Nov 7, 2017
a2f657c
review: fix typo.
PiotrSikora Nov 7, 2017
0dc9b85
review: fix stack-use-after-scope.
PiotrSikora Nov 7, 2017
348e6e8
review: describe algorithm in findSslServerContext().
PiotrSikora Nov 7, 2017
7ce1bbb
review: reject configurations with mixed use of Session Ticket Keys.
PiotrSikora Nov 7, 2017
1733d92
review: fix non-debug tests (again...).
PiotrSikora Nov 7, 2017
909ada7
review: proper use of unordered_map.
PiotrSikora Nov 7, 2017
1dbe7af
review: expand comment on why Session Ticket Keys can't be mixed.
PiotrSikora Nov 7, 2017
00bf76d
review: don't init std::string.
PiotrSikora Nov 9, 2017
5c6e262
review: skip context update early on.
PiotrSikora Nov 9, 2017
ce8ccc6
review: use ctx_.get()
PiotrSikora Nov 9, 2017
0a25690
review: erase(ctx).
PiotrSikora Nov 9, 2017
c29cd5b
review: improve skip_context_update.
PiotrSikora Nov 9, 2017
37e083d
review: add TODO about making it lockless.
PiotrSikora Nov 9, 2017
7f18d00
review: add comment about skip_context_update.
PiotrSikora Nov 9, 2017
80be681
review: rename updateConnection to updateConnectionContext.
PiotrSikora Nov 9, 2017
12e1260
review: add comment about "magic" 5.
PiotrSikora Nov 9, 2017
3f21609
review: fix lock in iterateContexts().
PiotrSikora Nov 9, 2017
c11713e
review: constify few functions.
PiotrSikora Nov 9, 2017
8dc1702
review: constify iterateContexts and use shared lock there.
PiotrSikora Nov 12, 2017
fa77d69
review: rename sslContext() to defaultSslContext().
PiotrSikora Nov 12, 2017
eb3ff75
review: remove invalid domain length restrictions.
PiotrSikora Nov 12, 2017
afc3b62
review: const bool skip_context_update.
PiotrSikora Nov 14, 2017
cd854a2
review: use uint32_t instead of size_t.
PiotrSikora Nov 14, 2017
aa5098b
review: add TODO to upstream SNI parsing to BoringSSL.
PiotrSikora Nov 14, 2017
da1dbaa
review: throw exception for configs with mixed use of STK.
PiotrSikora Nov 14, 2017
4ec1786
review: add ContextManagerImpl::isWildcardServerName().
PiotrSikora Nov 14, 2017
33d27e7
review: remove "x".
PiotrSikora Nov 15, 2017
757abeb
review: make sure all filter chains are equal.
PiotrSikora Nov 15, 2017
5a2fc2a
Merge remote-tracking branch 'origin/master' into piotrsikora/sni
PiotrSikora Nov 15, 2017
ea6660a
review: fixup bad merge.
PiotrSikora Nov 15, 2017
dd3cf76
review: update data-plane-api (for docs).
PiotrSikora Nov 16, 2017
f0a48d4
review: add TODO for refactoring with RouteMatcher::findVirtualHost().
PiotrSikora Nov 16, 2017
ddbaf41
review: add and use RepeatedPtrUtil::hash().
PiotrSikora Nov 16, 2017
6901dc0
reivew: fix format.
PiotrSikora Nov 16, 2017
f519595
review: fix typo.
PiotrSikora Nov 20, 2017
a43e74c
review: constify vars.
PiotrSikora Nov 20, 2017
5700d0c
review: move function prototype before vars.
PiotrSikora Nov 20, 2017
57ed156
review: use Optional<uint64_t>.
PiotrSikora Nov 20, 2017
5520293
review: add tests for listener manager.
PiotrSikora Nov 28, 2017
4ac9fdf
Merge remote-tracking branch 'origin/master' into piotrsikora/sni.
PiotrSikora Nov 28, 2017
122f450
review: remove factory variables.
PiotrSikora Nov 28, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/configuration/listeners/stats.rst
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ Every listener has a statistics tree rooted at *listener.<address>.* with the fo
ssl.handshake, Counter, Total successful TLS connection handshakes
ssl.session_reused, Counter, Total successful TLS session resumptions
ssl.no_certificate, Counter, Total successul TLS connections with no client certificate
ssl.fail_no_sni_match, Counter, Total TLS connections that were rejected because of missing SNI match
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will need to merge master and make a separate data-plane-api PR on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (pending merge in data-plane-api).

ssl.fail_verify_no_cert, Counter, Total TLS connections that failed because of missing client certificate
ssl.fail_verify_error, Counter, Total TLS connections that failed CA verification
ssl.fail_verify_san, Counter, Total TLS connections that failed SAN verification
Expand Down
15 changes: 13 additions & 2 deletions include/envoy/ssl/context_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -24,9 +24,20 @@ class ContextManager {

/**
* Builds a ServerContext from a ServerContextConfig.
* The skip_context_update parameter is used for fast-path (avoiding lock & context looup)x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errant "x"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora You wouldn't happen to use Emacs would you? 😜

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (and no, I don't use Emacs).

* on listeners with a single filter chain and no SNI restrictions.
*/
virtual ServerContextPtr createSslServerContext(Stats::Scope& scope,
ServerContextConfig& config) PURE;
virtual ServerContextPtr createSslServerContext(const std::string& listener_name,
const std::vector<std::string>& server_names,
Stats::Scope& scope, ServerContextConfig& config,
bool skip_context_update) PURE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document the meaning/effect of skip_context_update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


/**
* Find ServerContext for a given listener and server_name.
* @return ServerContext or nullptr in case there is no match.
*/
virtual ServerContext* findSslServerContext(const std::string& listener_name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be out of scope for this change, but should the ServerContextImpl has a reference to it's ConnectionHandler, then this function could be on ConnectionHandler and not need a listener_name param? This feels awkward because the ServerContext already exists under a listener/ConnectionHandler, so needing a separate listener lookup is strange.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed. This is related to my comments about making this lockless and adding TODO. I'm willing live with this to get this shipped, but ultimately don't like the global namespace for this. Really we want listener -> set of contexts -> global context manager. Will let the two of you sort it out.

Copy link
Contributor Author

@PiotrSikora PiotrSikora Nov 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no. While this indeed could be done much better for this specific use case, we're going to add SDS shortly and then it's going to get a bit more complex.

Personally, I'd defer any optimizations and/or refactoring until SDS is committed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with that.

const std::string& server_name) PURE;

/**
* @return the number of days until the next certificate being managed will expire.
Expand Down
1 change: 1 addition & 0 deletions source/common/ssl/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ envoy_cc_library(
"//include/envoy/stats:stats_interface",
"//include/envoy/stats:stats_macros",
"//source/common/common:assert_lib",
"//source/common/common:empty_string",
"//source/common/common:hex_lib",
],
)
91 changes: 83 additions & 8 deletions source/common/ssl/context_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,8 @@ int ContextImpl::sslContextIndex() {
}

ContextImpl::ContextImpl(ContextManagerImpl& parent, Stats::Scope& scope, ContextConfig& config)
: parent_(parent), ctx_(SSL_CTX_new(TLS_method())), scope_(scope),
stats_(generateStats(scope)) {
: parent_(parent), ctx_(SSL_CTX_new(TLS_method())), scope_(scope), stats_(generateStats(scope)),
ecdh_curves_(config.ecdhCurves()) {
RELEASE_ASSERT(ctx_);

int rc = SSL_CTX_set_ex_data(ctx_.get(), sslContextIndex(), this);
Expand All @@ -40,8 +40,8 @@ ContextImpl::ContextImpl(ContextManagerImpl& parent, Stats::Scope& scope, Contex
fmt::format("Failed to initialize cipher suites {}", config.cipherSuites()));
}

if (!SSL_CTX_set1_curves_list(ctx_.get(), config.ecdhCurves().c_str())) {
throw EnvoyException(fmt::format("Failed to initialize ECDH curves {}", config.ecdhCurves()));
if (!SSL_CTX_set1_curves_list(ctx_.get(), ecdh_curves_.c_str())) {
throw EnvoyException(fmt::format("Failed to initialize ECDH curves {}", ecdh_curves_));
}

int verify_mode = SSL_VERIFY_NONE;
Expand Down Expand Up @@ -355,10 +355,20 @@ bssl::UniquePtr<SSL> ClientContextImpl::newSsl() const {
return ssl_con;
}

ServerContextImpl::ServerContextImpl(ContextManagerImpl& parent, Stats::Scope& scope,
ServerContextConfig& config, Runtime::Loader& runtime)
: ContextImpl(parent, scope, config), runtime_(runtime),
ServerContextImpl::ServerContextImpl(ContextManagerImpl& parent, const std::string& listener_name,
const std::vector<std::string>& server_names,
Stats::Scope& scope, ServerContextConfig& config,
bool skip_context_update, Runtime::Loader& runtime)
: ContextImpl(parent, scope, config), listener_name_(listener_name),
server_names_(server_names), skip_context_update_(skip_context_update), runtime_(runtime),
session_ticket_keys_(config.sessionTicketKeys()) {
SSL_CTX_set_select_certificate_cb(
ctx_.get(), [](const SSL_CLIENT_HELLO* client_hello) -> ssl_select_cert_result_t {
ContextImpl* context_impl = static_cast<ContextImpl*>(
SSL_CTX_get_ex_data(SSL_get_SSL_CTX(client_hello->ssl), sslContextIndex()));
return dynamic_cast<ServerContextImpl*>(context_impl)->processClientHello(client_hello);
});

if (!config.caCertFile().empty()) {
bssl::UniquePtr<STACK_OF(X509_NAME)> list(SSL_load_client_CA_file(config.caCertFile().c_str()));
if (nullptr == list) {
Expand Down Expand Up @@ -404,7 +414,7 @@ ServerContextImpl::ServerContextImpl(ContextManagerImpl& parent, Stats::Scope& s
int rc = EVP_DigestInit(&md, EVP_sha256());
RELEASE_ASSERT(rc == 1);

// Hash the CommonName/SANs in the server certificate. This makes sure that
// Hash the CommonName/SANs of the server certificate. This makes sure that
// sessions can only be resumed to a certificate for the same name, but allows
// resuming to unique certs in the case that different Envoy instances each have
// their own certs.
Expand Down Expand Up @@ -469,12 +479,77 @@ ServerContextImpl::ServerContextImpl(ContextManagerImpl& parent, Stats::Scope& s
RELEASE_ASSERT(rc == 1);
}

// Hash configured SNIs for this context, so that sessions cannot be resumed across different
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't totally follow this. Should this be "so that sessions cannot be resumed across listeners with different configurations?" What is the downside of allowing resumption if the SNI cert matches?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, except s/listeners/filter chains/.

Imagine certificate for *.example.com that's used on different filter chains (e.g. www.example.com and admin.example.com). There is no reason sessions on those logically separate "virtual hosts" should be resumable simply because they use the same wildcard certificate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe we are using different lingo or I'm just confused. I think by far the most common use of SNI is going to be a single filter chain with multiple certificates, where host/authority header allows the virtual host in the route table to be selected. Very few people would regularly configure a full HTTP connection manager per SNI host I think. AFAICT the below check basically says that we don't allow resumption unless the filter chain has all the same server names that it had before. This seems wrong to me?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, but that's not what the data-plane-api allows you to do, really.

Per current (and frozen) specification, sni_domains are part of the FilterChainMatch, which is used to select FilterChain. Each FilterChain contains a single DownstreamTlsContext, therefore multiple DownstreamTlsContext require multiple filter chains, and there is no other SNI to TLS certificate mapping.

cc @htuch

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see. I think I totally missed that part of how we did the protos. I'm concerned users will find that burdensome since it will potentially require a different route/RDS table for each sever name. I now understand why you are calling this a hack w/o full filter chain matching support.

My concern here is that once this code goes out we are actually providing the behavior that I originally thought people would want (multiple server names -> 1 connection manager). Can we really take it away once we give it to them? At minimum we would have to put it through the full deprecation cycle.

This is a tough one. I'm wondering if it's worth it for me to help out and try to get the multiple filter chains working?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora I talked offline with @htuch and he had the idea that to help with back-compat, we should check that all the filter chains are identical and throw if they aren't. Then in the future we can relax this. I think that will make me OK with this.

I also now think I understand why the multiple filter chain stuff is complicated. You basically have to complete the handshake before you do any filter chain processing/selection. I can have a look at all of this offline and come back with some implementation notes for the future if that would be useful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I was planning to use multiple filter chains heavily. That decision isn't set in stone, but it's I was planning to model my configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggreenway we are going to implement multiple filter chains, it's just a matter of when. It's pretty urgent that we get base SNI support merged so I think this approach is fine to start with as long as we future proof it. Then I will likely look at the multiple filter chain aspect after this merges.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

// filter chains, even when using the same server certificate.
for (const auto& name : server_names_) {
rc = EVP_DigestUpdate(&md, name.data(), name.size());
RELEASE_ASSERT(rc == 1);
}

rc = EVP_DigestFinal(&md, session_context_buf, &session_context_len);
RELEASE_ASSERT(rc == 1);
rc = SSL_CTX_set_session_id_context(ctx_.get(), session_context_buf, session_context_len);
RELEASE_ASSERT(rc == 1);
}

ssl_select_cert_result_t
ServerContextImpl::processClientHello(const SSL_CLIENT_HELLO* client_hello) {
if (skip_context_update_) {
return ssl_select_cert_success;
}

std::string server_name;
const uint8_t* data;
size_t len;

if (SSL_early_callback_ctx_extension_get(client_hello, TLSEXT_TYPE_server_name, &data, &len)) {
// Based on BoringSSL's ext_sni_parse_clienthello().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we can't use the boring ssl function here? Is it worth putting in a TODO here to replace with a boringssl helper? cc @davidben

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few reasons:

  1. ext_sni_parse_clienthello() isn't exported and early callback (which we need to be able to set different protocol versions across different filter chains) is called so early that SNI isn't parsed yet and it's not available via SSL_set_tlsext_host_name().
  2. I'm going to need a few more changes for ClientHello sniffing (for SNI-based routing without TLS termination #1843), so I'm going to try to upstream everything that's needed once we have a complete picture.
  3. We're using chromium-stable branch of BoringSSL, so anything we add to BoringSSL wouldn't be available to use for the next month or two anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK SGTM. Maybe just add a TODO to circle back and upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// Match on empty SNI instead of rejecting connection in case we cannot process the extension.
CBS extension;
CBS_init(&extension, data, len);
CBS server_name_list, host_name;
uint8_t name_type;
if (CBS_get_u16_length_prefixed(&extension, &server_name_list) &&
CBS_get_u8(&server_name_list, &name_type) &&
CBS_get_u16_length_prefixed(&server_name_list, &host_name) &&
CBS_len(&server_name_list) == 0 && CBS_len(&extension) == 0 &&
name_type == TLSEXT_NAMETYPE_host_name && CBS_len(&host_name) != 0 &&
CBS_len(&host_name) <= TLSEXT_MAXLEN_host_name && !CBS_contains_zero_byte(&host_name)) {
server_name.assign(reinterpret_cast<const char*>(CBS_data(&host_name)), CBS_len(&host_name));
}
}

ServerContext* new_ctx = parent_.findSslServerContext(listener_name_, server_name);

// Reject connection if we didn't find a match.
if (new_ctx == nullptr) {
stats_.fail_no_sni_match_.inc();
return ssl_select_cert_error;
}

// Update context if it changed.
if (new_ctx != this) {
ServerContextImpl* new_impl = dynamic_cast<ServerContextImpl*>(new_ctx);
new_impl->updateConnectionContext(client_hello->ssl);
}

return ssl_select_cert_success;
}

void ServerContextImpl::updateConnectionContext(SSL* ssl) {
ASSERT(ctx_);

SSL_set_SSL_CTX(ssl, ctx_.get());
ASSERT(SSL_CTX_get_ex_data(ctx_.get(), sslContextIndex()) == this);

// Update SSL-level settings and parameters that are inherited from SSL_CTX during SSL_new().
SSL_set_verify(ssl, SSL_CTX_get_verify_mode(ctx_.get()), SSL_CTX_get_verify_callback(ctx_.get()));

int rc = SSL_set1_curves_list(ssl, ecdh_curves_.c_str());
ASSERT(rc == 1);
UNREFERENCED_PARAMETER(rc);
}

int ServerContextImpl::sessionTicketProcess(SSL*, uint8_t* key_name, uint8_t* iv,
EVP_CIPHER_CTX* ctx, HMAC_CTX* hmac_ctx, int encrypt) {
const EVP_MD* hmac = EVP_sha256();
Expand Down
20 changes: 15 additions & 5 deletions source/common/ssl/context_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ namespace Ssl {
COUNTER(handshake) \
COUNTER(session_reused) \
COUNTER(no_certificate) \
COUNTER(fail_no_sni_match) \
COUNTER(fail_verify_no_cert) \
COUNTER(fail_verify_error) \
COUNTER(fail_verify_san) \
Expand All @@ -42,8 +43,6 @@ struct SslStats {

class ContextImpl : public virtual Context {
public:
~ContextImpl() { parent_.releaseContext(this); }

virtual bssl::UniquePtr<SSL> newSsl() const;

/**
Expand Down Expand Up @@ -119,11 +118,13 @@ class ContextImpl : public virtual Context {
bssl::UniquePtr<X509> cert_chain_;
std::string ca_file_path_;
std::string cert_chain_file_path_;
const std::string ecdh_curves_;
};

class ClientContextImpl : public ContextImpl, public ClientContext {
public:
ClientContextImpl(ContextManagerImpl& parent, Stats::Scope& scope, ClientContextConfig& config);
~ClientContextImpl() { parent_.releaseClientContext(this); }

bssl::UniquePtr<SSL> newSsl() const override;

Expand All @@ -133,19 +134,28 @@ class ClientContextImpl : public ContextImpl, public ClientContext {

class ServerContextImpl : public ContextImpl, public ServerContext {
public:
ServerContextImpl(ContextManagerImpl& parent, Stats::Scope& scope, ServerContextConfig& config,
ServerContextImpl(ContextManagerImpl& parent, const std::string& listener_name,
const std::vector<std::string>& server_names, Stats::Scope& scope,
ServerContextConfig& config, bool skip_context_update,
Runtime::Loader& runtime);
~ServerContextImpl() { parent_.releaseServerContext(this, listener_name_, server_names_); }

private:
ssl_select_cert_result_t processClientHello(const SSL_CLIENT_HELLO* client_hello);
void updateConnectionContext(SSL* ssl);

int alpnSelectCallback(const unsigned char** out, unsigned char* outlen, const unsigned char* in,
unsigned int inlen);
int sessionTicketProcess(SSL* ssl, uint8_t* key_name, uint8_t* iv, EVP_CIPHER_CTX* ctx,
HMAC_CTX* hmac_ctx, int encrypt);

const std::string listener_name_;
const std::vector<std::string> server_names_;
const bool skip_context_update_;
Runtime::Loader& runtime_;
std::vector<uint8_t> parsed_alt_alpn_protocols_;
const std::vector<ServerContextConfig::SessionTicketKey> session_ticket_keys_;
};

} // Ssl
} // Envoy
} // namespace Ssl
} // namespace Envoy
114 changes: 103 additions & 11 deletions source/common/ssl/context_manager_impl.cc
Original file line number Diff line number Diff line change
@@ -1,18 +1,54 @@
#include "common/ssl/context_manager_impl.h"

#include <functional>
#include <mutex>
#include <shared_mutex>

#include "common/common/assert.h"
#include "common/common/empty_string.h"
#include "common/ssl/context_impl.h"

namespace Envoy {
namespace Ssl {

ContextManagerImpl::~ContextManagerImpl() { ASSERT(contexts_.empty()); }

void ContextManagerImpl::releaseContext(Context* context) {
std::unique_lock<std::mutex> lock(contexts_lock_);
void ContextManagerImpl::releaseClientContext(ClientContext* context) {
std::unique_lock<std::shared_timed_mutex> lock(contexts_lock_);

// context may not be found, in the case that a subclass of Context throws
// in it's constructor. In that case the context did not get added, but
// the destructor of Context will run and call releaseContext().
contexts_.remove(context);
}

void ContextManagerImpl::releaseServerContext(ServerContext* context,
const std::string& listener_name,
const std::vector<std::string>& server_names) {
std::unique_lock<std::shared_timed_mutex> lock(contexts_lock_);

// Remove mappings.
auto& listener_map_exact = map_exact_[listener_name];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I still think factoring out the map handler into a utility class would be easier to understand and simpler to test in multiple permutations. Up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

if (server_names.empty()) {
auto ctx = listener_map_exact.find(EMPTY_STRING);
if (ctx != listener_map_exact.end() && ctx->second == context) {
listener_map_exact.erase(ctx);
}
} else {
auto& listener_map_wildcard = map_wildcard_[listener_name];
for (const auto& name : server_names) {
if (name.size() > 2 && name[0] == '*' && name[1] == '.') {
auto ctx = listener_map_wildcard.find(name);
if (ctx != listener_map_wildcard.end() && ctx->second == context) {
listener_map_wildcard.erase(ctx);
}
} else {
auto ctx = listener_map_exact.find(name);
if (ctx != listener_map_exact.end() && ctx->second == context) {
listener_map_exact.erase(ctx);
}
}
}
}

// context may not be found, in the case that a subclass of Context throws
// in it's constructor. In that case the context did not get added, but
Expand All @@ -22,23 +58,79 @@ void ContextManagerImpl::releaseContext(Context* context) {

ClientContextPtr ContextManagerImpl::createSslClientContext(Stats::Scope& scope,
ClientContextConfig& config) {

ClientContextPtr context(new ClientContextImpl(*this, scope, config));
std::unique_lock<std::mutex> lock(contexts_lock_);
std::unique_lock<std::shared_timed_mutex> lock(contexts_lock_);
contexts_.emplace_back(context.get());
return context;
}

ServerContextPtr ContextManagerImpl::createSslServerContext(Stats::Scope& scope,
ServerContextConfig& config) {
ServerContextPtr context(new ServerContextImpl(*this, scope, config, runtime_));
std::unique_lock<std::mutex> lock(contexts_lock_);
ServerContextPtr ContextManagerImpl::createSslServerContext(
const std::string& listener_name, const std::vector<std::string>& server_names,
Stats::Scope& scope, ServerContextConfig& config, bool skip_context_update) {
ServerContextPtr context(new ServerContextImpl(*this, listener_name, server_names, scope, config,
skip_context_update, runtime_));
std::unique_lock<std::shared_timed_mutex> lock(contexts_lock_);
contexts_.emplace_back(context.get());

// Save mappings.
if (server_names.empty()) {
map_exact_[listener_name][EMPTY_STRING] = context.get();
} else {
for (const auto& name : server_names) {
if (name.size() > 2 && name[0] == '*' && name[1] == '.') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This non-trivial condition appears in multiple places. Make it a helper?
bool isWildcardServerName(const std::string& name)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, but it feels like over-engineering to me.

map_wildcard_[listener_name][name] = context.get();
} else {
map_exact_[listener_name][name] = context.get();
}
}
}

return context;
}

ServerContext* ContextManagerImpl::findSslServerContext(const std::string& listener_name,
const std::string& server_name) {
// Find Ssl::ServerContext to use. The algorithm for "www.example.com" is as follows:
// 1. Try exact match on domain, i.e. "www.example.com"
// 2. Try exact match on wildcard, i.e. "*.example.com"
// 3. Try "no SNI" match, i.e. ""
// 4. Return no context and reject connection.

// TODO(PiotrSikora): make this lockless.
std::shared_lock<std::shared_timed_mutex> lock(contexts_lock_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you throw a TODO here (attributed to me if you want) about removing locking from this function. With R/W lock it's fine but would prefer to ultimately make it lockless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I think we could fix that by using std::shared_ptr, but I didn't want to block this PR on it. Also, keep in mind that whatever we're going to do now, it's most likely going to change with SDS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure we don't mess up the reader-writer lock semantics, can this function be const? Or can we only access the member variables through a const-ref? Then the compiler will error if anything non-const is accidentally done. Or maybe put most of the logic into a static member that takes the maps by const-ref parameter.

auto& listener_map_exact = map_exact_[listener_name];
auto ctx = listener_map_exact.find(server_name);
if (ctx != listener_map_exact.end()) {
return ctx->second;
}

// Try to construct and match wildcard domain.
// Theoretically, 5 is the minimum legal domain length for the wildcard certificate (i.e. *.a.b).
if (server_name.size() >= 5) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't "*.a" be a valid wildcard domain? Either for a gTLD or for something internal to a company? I think we can just drop this condition. The common case will either be something >5 or empty string, and empty string will quickly bail out when trying to find('.'), so I don't think the performance will matter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, that's indeed valid wildcard when using internal CAs. I removed this restriction.

size_t pos = server_name.find('.');
if (pos > 0) {
Copy link
Contributor

@ggreenway ggreenway Nov 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (pos != std::string::npos && pos != 0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewritten to if (pos > 0 && pos < server_name.size() - 1) { (because of change in restrictions). I don't believe that we need pos != std::string::npos anymore, but feel free to correct me, since I'm clearly not a C++ expert.

size_t rpos = server_name.rfind('.');
if (rpos > pos + 1 && rpos != server_name.size() - 1) {
std::string wildcard = '*' + server_name.substr(pos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: any reason to not use the wildcard matching algorithm that we have for virtual hosts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to reusing the matching code in config_impl. Can we factor that out into a utility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which code do you mean, exactly? ContextImpl::dNSNameMatch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source/common/router/config_impl.h, RouteMatcher does similar work. See "domains" in https://www.envoyproxy.io/envoy/configuration/http_conn_man/route_config/vhost.html#config-http-conn-man-route-table-vhost. Without a strong argument against, I think it makes sense for the matching semantics to be the same between this code and that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd +1 as well for the reason that we might change the implementation to be smarter one day, e.g. trie-based as per @tschroed 's earlier implementation. Probably not needed right now, but just as future proofing.

auto& listener_map_wildcard = map_wildcard_[listener_name];
auto ctx = listener_map_wildcard.find(wildcard);
if (ctx != listener_map_wildcard.end()) {
return ctx->second;
}
}
}
}

ctx = listener_map_exact.find(EMPTY_STRING);
if (ctx != listener_map_exact.end()) {
return ctx->second;
}

return nullptr;
}

size_t ContextManagerImpl::daysUntilFirstCertExpires() {
std::unique_lock<std::mutex> lock(contexts_lock_);
std::shared_lock<std::shared_timed_mutex> lock(contexts_lock_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about function const-ness as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

size_t ret = std::numeric_limits<int>::max();
for (Context* context : contexts_) {
ret = std::min<size_t>(context->daysUntilFirstCertExpires(), ret);
Expand All @@ -47,7 +139,7 @@ size_t ContextManagerImpl::daysUntilFirstCertExpires() {
}

void ContextManagerImpl::iterateContexts(std::function<void(Context&)> callback) {
std::unique_lock<std::mutex> lock(contexts_lock_);
std::unique_lock<std::shared_timed_mutex> lock(contexts_lock_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this a shared lock? Is it because the callback takes a non-const Context? I think you could fix that and make the iterate callback take a const Context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked to make it non-shared because of the non-const callback param. If all callbacks passed to this can use a const Context& then the type should be changed and this can be a shared lock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for (Context* context : contexts_) {
callback(*context);
}
Expand Down
Loading