Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssl: add support for SNI. #1984

Merged
merged 53 commits into from
Nov 28, 2017

Conversation

PiotrSikora
Copy link
Contributor

This version serves different TLS certificates based on SNI,
but it doesn't select alternative filter chains.

Partially fixes #95.

This version serves different TLS certificates based on SNI,
but it doesn't select alternative filter chains.

Partially fixes envoyproxy#95.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
@PiotrSikora
Copy link
Contributor Author

Tests to follow...

cc @mattklein123 @rshriram

@rshriram
Copy link
Member

rshriram commented Nov 1, 2017

yay!

if (pos > 0) {
size_t rpos = server_name.rfind('.');
if (rpos > pos + 1 && rpos != server_name.size() - 1) {
std::string wildcard = '*' + server_name.substr(pos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: any reason to not use the wildcard matching algorithm that we have for virtual hosts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to reusing the matching code in config_impl. Can we factor that out into a utility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which code do you mean, exactly? ContextImpl::dNSNameMatch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source/common/router/config_impl.h, RouteMatcher does similar work. See "domains" in https://www.envoyproxy.io/envoy/configuration/http_conn_man/route_config/vhost.html#config-http-conn-man-route-table-vhost. Without a strong argument against, I think it makes sense for the matching semantics to be the same between this code and that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd +1 as well for the reason that we might change the implementation to be smarter one day, e.g. trie-based as per @tschroed 's earlier implementation. Probably not needed right now, but just as future proofing.

}
}

return map_exact_[listener_name][EMPTY_STRING];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is never going to return nullptr right? But the comment says it might return nullptr if not found. Is this doing the fall back thing that we discussed a few days ago?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can, e.g. if all available filter chains have sni_domains configured, but none of them match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my confusion lies in the assumption that all entries of map_exact_ are non null, which does not seem to be the case.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @PiotrSikora! Extremely awesome to see this about to land. Few comments to get started.

size_t len;

if (SSL_early_callback_ctx_extension_get(client_hello, TLSEXT_TYPE_server_name, &data, &len)) {
/* Based on BoringSSL's ext_sni_parse_clienthello(). Match on empty SNI if we encounter any
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: // style comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

void ServerContextImpl::updateConnection(SSL* ssl) {
RELEASE_ASSERT(ctx_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would just do ASSERT here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (pos > 0) {
size_t rpos = server_name.rfind('.');
if (rpos > pos + 1 && rpos != server_name.size() - 1) {
std::string wildcard = '*' + server_name.substr(pos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to reusing the matching code in config_impl. Can we factor that out into a utility?

return context;
}

ServerContext* ContextManagerImpl::findSslServerContext(const std::string& listener_name,
const std::string& server_name) {
std::unique_lock<std::mutex> lock(contexts_lock_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very non-optimal that we are acquiring a lock here on every handshake, especially regardless of whether SNI is being used or not. In the interest of moving this along, I'm OK living with this for now. At minimum please put in a TODO. The quick fix is that we can make this a shared mutex and only acquire read access when we do a find. We are going to move to C++14 soon and that will become available. I would prefer that this be completely lockless within a single listener, and I can think of ways of doing this, but it would be a larger change. If you want to brainstorm on this let's chat offline or we can just leave as part of the TODO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the uninitiated, why do we need locks here? is this shared global context across all threads, where the context can be updated by LDS or future SDS updates? Is that the reason for suggesting a r/w lock?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock is required because we print the contexts as part of admin output, and contexts come and go with listeners. C++14 support has merged so this should be changed to a shared_mutex (I think C++14 only has the timed one, but it works in not timed mode). We can fix the need for a lock in this flow at all later and leave a TODO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ServerContext* ContextManagerImpl::findSslServerContext(const std::string& listener_name,
const std::string& server_name) {
std::unique_lock<std::mutex> lock(contexts_lock_);
if (map_exact_[listener_name][server_name] != nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case that server_name wasn't in the inner map, this inserts it, based on the user-provided SNI value in the handshake. I think you need to use map_exact[listener_name].find(server_name) != end, and assume there are never nullptr's in the map. Similar change needed in other places map_exact_ is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

size_t rpos = server_name.rfind('.');
if (rpos > pos + 1 && rpos != server_name.size() - 1) {
std::string wildcard = '*' + server_name.substr(pos);
if (map_wildcard_[listener_name][wildcard] != nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above re: inserting into the map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
@rshriram
Copy link
Member

rshriram commented Nov 7, 2017 via email

@PiotrSikora
Copy link
Contributor Author

@rshriram RouteMatcher::findWildcardVirtualHost() is quite different from what I do here (longest match vs exact match).

@rshriram
Copy link
Member

rshriram commented Nov 7, 2017

A lengthy and detailed comment would help a lot. Like https://github.com/envoyproxy/envoy/blob/master/source/common/router/config_impl.cc#L661
Especially what is the intended input/how people would setup wildcard certs, and how we match, the semantics, etc.

Found by ASan.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
@@ -86,7 +86,7 @@ ServerContextPtr ContextManagerImpl::createSslServerContext(
ServerContext* ContextManagerImpl::findSslServerContext(const std::string& listener_name,
const std::string& server_name) {
std::shared_lock<std::shared_timed_mutex> lock(contexts_lock_);
if (map_exact_[listener_name][server_name] != nullptr) {
if (map_exact_[listener_name].find(server_name) != map_exact_[listener_name].end()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is functionally correct, but it uses 5 hash table lookups when it only needs to do two. Should be:

auto& listener_map = map_exact_[listener_name];
auto server_name_it = listener_map.find(server_name);
if (server_name_it != listener_map.end()) {
return *server_name_it;
}

Same in other places using this map.

}

filter_factories_ = parent_.factory_.createFilterFactoryList(filter_chain.filters(), *this);
// TODO(PiotrSikora): allow filter chains with mixed use of Session Ticket Keys.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment on why this won't work as is? I thought this would have worked as-is.

Copy link
Contributor Author

@PiotrSikora PiotrSikora Nov 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Note that there are few ways to make it work, but I don't want to block this PR on that.

// Remove mappings.
if (server_names.empty()) {
if (map_exact_[listener_name][EMPTY_STRING] == context) {
map_exact_[listener_name][EMPTY_STRING] = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting to nullptr doesn't remove from the map. Need one of the variants of erase: http://en.cppreference.com/w/cpp/container/unordered_map/erase

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, I know that... Good catch, fixed, thanks!

@mattklein123
Copy link
Member

RouteMatcher::findWildcardVirtualHost() is quite different from what I do here (longest match vs exact match).

@PiotrSikora quick drive by: you might consider breaking the matching stuff out into a separate utility class. I think it would be easier to unit test various matching cases.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small things then ready to 🚢

@@ -24,19 +24,30 @@ class ContextManager {

/**
* Builds a ServerContext from a ServerContextConfig.
* The skip_context_update parameter is used for fast-path (avoiding lock & context looup)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "looup"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

std::shared_lock<std::shared_timed_mutex> lock(contexts_lock_);

// TODO(PiotrSikora): refactor and combine code with RouteMatcher::findVirtualHost().
auto listener_map_exact = map_exact_.find(listener_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think most of the vars in this function can be const (including the auto ones).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

std::unordered_map<std::string, std::unordered_map<std::string, ServerContext*>> map_exact_;
std::unordered_map<std::string, std::unordered_map<std::string, ServerContext*>> map_wildcard_;

static bool isWildcardServerName(const std::string& name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: functions go above variables (move to line 48 followed by newline)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(config.filter_chains().size() == 1 &&
config.filter_chains()[0].filter_chain_match().sni_domains().empty());

size_t filters_hash = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically, the hash can be 0. I would use Optional<uint64_t> for this variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

filters_hash = RepeatedPtrUtil::hash(filter_chain.filters());
filter_factories_ = parent_.factory_.createFilterFactoryList(filter_chain.filters(), *this);
} else if (filters_hash != RepeatedPtrUtil::hash(filter_chain.filters())) {
throw EnvoyException(fmt::format("error adding listener '{}': use of different filter chains "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would say "use of SNI and different filter chains..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, existing code allows multiple filter chains, as long as they have the same filters, regardless of use of SNI, so I think that existing text is a bit more correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that's fine.

// callback, which is going to be called iff it's set on the initial SSL_CTX, even if it's not
// set on the current SSL_CTX that doesn't have any Session Ticket Keys configured.
if (has_stk != 0 && has_stk != has_tls) {
throw EnvoyException(fmt::format("error adding listener '{}': filter chains with mixed use of "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed it but I don't see tests for these error cases. Can we add some listener_manager_impl tests that EXPECT_THROW_WITH_MESSAGE on bad configs for here and the not matching filter chains?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm working on those, but I'm on vacations and traveling this week, so I don't have any ETA)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I'm not sure if it's also worth it to have a real integration test which would exercise the happy path of the listener manager code. It's up to you but definitely would like to get some coverage on the listener manager and the error cases. Thank you for working on this! So close!

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
@mattklein123
Copy link
Member

@PiotrSikora also you probably saw but FYI needs a master merge also.

@Morriz
Copy link

Morriz commented Nov 23, 2017

so close! yess! wringing hands

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Small question.

@@ -749,5 +750,237 @@ TEST_F(ListenerManagerImplTest, EarlyShutdown) {
manager_->stopWorkers();
}

TEST_F(ListenerManagerImplWithRealFiltersTest, SniWithSingleFilterChain) {
Server::Configuration::HttpConnectionManagerFilterConfigFactory factory;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are those factory local variables needed? AFAICT they are not used?

Copy link
Contributor Author

@PiotrSikora PiotrSikora Nov 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to auto-register envoy.http_connection_manager, otherwise loading of the configuration will fail because of an unknown filter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I think the issue is not this variable, but just linking against the translation unit that has the config factory in it. (Since that is where static registration happens). I don't think this variable actually does anything. If you leave the dep and remove the variable and include does it pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, linking against it is enough to fix that. Thanks!

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
@mattklein123 mattklein123 merged commit c26b25e into envoyproxy:master Nov 28, 2017
SEJeff added a commit to SEJeff/contour that referenced this pull request Dec 10, 2017
SEJeff added a commit to SEJeff/contour that referenced this pull request Dec 10, 2017
SEJeff added a commit to SEJeff/contour that referenced this pull request Dec 10, 2017
envoyproxy/envoy#1984

Signed-off-by: Jeff Schroeder <jeffschroeder@computer.org>
@LuyaoZhong
Copy link
Contributor

@PiotrSikora @mattklein123

Current Envoy does not support selecting certificate based on SNI. Why do we remove this functionality which was done 5 years ago?

I'm working on a feature called TLS bumping, in which scenario there might have multiple certs for different SNI inside one transport socket.

@ggreenway
Copy link
Contributor

@LuyaoZhong Envoy does support selecting a certificate based on SNI, it's just done by selecting a filter-chain. No functionality was removed; it was just changed.

@LuyaoZhong
Copy link
Contributor

LuyaoZhong commented Jun 14, 2022

@ggreenway

Got it. But I still would like to support tls-transport-layer cert selecting based on SNI.

Consider TLS bumping, its initial design has been accepted by @mattklein123. We might have a SNI list (*.google.com, *.cnn.com, etc.) for bumping, which means we would like to inspect the traffic to these websites. We will implement on-demand mimic certificates generating during runtime and attaching them to one single transport socket to do TLS handshake, in this case we need to support cert selecting inside one transport socket similar as this PR did.

@mattklein123
Copy link
Member

In general there are cases in which the filter chain based selection model does not work well (massive number of certs, etc.). I think it's reasonable to do a custom plugin for cert selection (I thought we already support this extension point but I haven't looked in a while.)

@ggreenway
Copy link
Contributor

Agreed that it makes sense, but let's move discussion to the relevant issue, not a PR from 2017.

jpsim pushed a commit that referenced this pull request Nov 28, 2022
Description: add EngineBuilder API to filter unroutable families
Risk Level: low - opt in API
Testing: upstream tests.
Docs Changes: updated

Signed-off-by: Jose Nino <jnino@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
Description: add EngineBuilder API to filter unroutable families
Risk Level: low - opt in API
Testing: upstream tests.
Docs Changes: updated

Signed-off-by: Jose Nino <jnino@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TLS: Server side SNI
8 participants