Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to dynamically update TLS certificates #5228

Open
wants to merge 39 commits into
base: main
Choose a base branch
from

Conversation

my4-dev
Copy link
Contributor

@my4-dev my4-dev commented Oct 9, 2023

Motivation:

#5033

API design note:

  • TlsKeyPair represents a pair of PrivateKey and X509Certificate chain.
    • This API is used as an official key pair type in Armeria.
    • All APIs for specifying key pairs in TlsSetters have been deprecated in favor of TlsSetters tls(TlsKeyPair).
      TlsKeyPair.of(privateKey, certificate);
      TlsKeyPair.of(privateKey, keyPassword, certificate);
  • TlsProvider dynamically resolves a TlsKeyPair for the given hostname when a connection is established.
    • DNS wildcard format is supported as a hostname.
    • * is used as a special hostname to get the TlsKeyPair for the default virtual host.
      TlsProvider
         .builderForServer()
          // Set the default key pair.
         .setDefault(TlsKeyPair.of(...))
         // Set the key pair for "*.example.com".
         .set("*.example.com", TlsKeyPair.of(...))
         .build();
    • To dynamically update/reload TlsKeyPair, a custom TlsProvider can be implemented.
      class DynamicTlsProvider implements TlsProvider {
         @Override
         public TlsKeyPair find(String hostname) {
             // relodableCache will be updated periodically by a scheduler
             return relodableCache.get(hostname);
         }
      }
      • The newly returned key pair is used for the TLS handshake of new connections.
  • ServerTlsConfig and ClientTlsConfig are added to override the default values and customize SslContextBuilder.
    • Unlike TlsProvider, *TlsConfig are immutable so all TlsKeyPairs returned by a TlsProvider build SslContext with the same configuration.
  • Both server and client allow TlsProvider and TlsKeyPair for TLS configurations.
    Server
      .builder()
      // For dynamic usage
      .tlsProvider(tlsProvider)
      // For customizing TLS
      .tlsProvider(tlsProvider, serverTlsConfig)
      // For sample usage
      .tls(tlsKeyPair)
      .build()
    
    ClientFactory
      .builder()
      // For dynamic usage
      .tlsProvider(tlsProvider)
      // For customizing TLS
      .tlsProvider(tlsProvider, clientTlsConfig)
      // For sample usage
      .tls(tlsKeyPair)
      .build()
    • Some internal implementations for TLS handshake have been changed to create SslContext dynamically.

Modifications:

  • Server
    • Add TlsProviderMapping that converts TlsProvider into SslContext Mapping for SniHandler.
      • A dynamic TlsProvider can be used to update the certificates without Server.reconfigure().
    • Add a setter method for TlsProvider to ServerBuilder.
      • A builder method for VirtualHost isn't added because a TlsProvider can contain multiple certificates.
        • If necessary, I will consider TlsProvider at the virtual host level later.
  • Client
    • Fix Bootstraps to create a Bootstrap with a TlsKeyPair returned by TlsProvider when a new connection is created.
      • If no TlsProvider is set, the original behavior that returns predefined BootStraap is used.
    • Add options for TlsProvider to ClientFactoryBuilder.
  • Common
    • TlsProvider provides separate builders for the client and server.
    • TlsKeyPair provides various factory methods to easily create a key pair from different resources.
    • Cache SslContexts and expire them after 1 hour of inactivity.
      • If you think that the caching strategy will not be effective, I am willing to revert it.
    • Add CloseableMeterBinder to unregister when the associated resource is unused.
  • Deprecate) The following APIs have been deprecated:
    • TlsSetters tls(File keyCertChainFile, File keyFile)
    • TlsSetters tls(File keyCertChainFile, File keyFile, @Nullable String keyPassword)
    • TlsSetters tls(InputStream keyCertChainInputStream, InputStream keyInputStream)
    • TlsSetters tls(InputStream keyCertChainInputStream, InputStream keyInputStream, @Nullable String keyPassword)
    • TlsSetters tls(PrivateKey key, X509Certificate... keyCertChain)
    • TlsSetters tls(PrivateKey key, Iterable<? extends X509Certificate> keyCertChain)
    • TlsSetters tls(PrivateKey key, @Nullable String keyPassword, X509Certificate... keyCertChain)
    • TlsSetters tls(PrivateKey key, @Nullable String keyPassword, Iterable<? extends X509Certificate> keyCertChain)

Result:

@my4-dev
Copy link
Contributor Author

my4-dev commented Oct 9, 2023

Hi, @ikhoon !
This PR is just a poc and incomplete.
Please confirm that this implementation is appropriate to solve this Issue.
(This is a little different from your suggestion)

@minwoox minwoox added this to the 1.27.0 milestone Oct 11, 2023
@ikhoon
Copy link
Contributor

ikhoon commented Oct 31, 2023

The priority of this issue has been raised. Please let me finish this PR. 🙇‍♂️

@my4-dev
Copy link
Contributor Author

my4-dev commented Oct 31, 2023

I'm sorry to be late for this reply.
I was so busy that I couldn't confirm this PR🙇‍♂️

@ikhoon
Copy link
Contributor

ikhoon commented Oct 31, 2023

No worries. I’m really grateful for your contribution. 🙇‍♂️

@ikhoon ikhoon changed the title Specify a client TLS key pair without ClientFactory Provide a way to dynamically update TLS certificates Dec 20, 2023
Copy link

codecov bot commented Jan 3, 2024

Codecov Report

Attention: Patch coverage is 65.77017% with 140 lines in your changes missing coverage. Please review.

Project coverage is 73.62%. Comparing base (e08527d) to head (cb798b2).
Report is 7 commits behind head on main.

Current head cb798b2 differs from pull request most recent head be0752d

Please upload reports for the commit be0752d to get more accurate results.

Files Patch % Lines
...ecorp/armeria/internal/common/TlsProviderUtil.java 73.61% 10 Missing and 9 partials ⚠️
.../armeria/internal/common/util/CertificateUtil.java 40.00% 12 Missing ⚠️
...n/java/com/linecorp/armeria/common/TlsKeyPair.java 72.50% 9 Missing and 2 partials ⚠️
...ava/com/linecorp/armeria/server/ServerBuilder.java 60.71% 5 Missing and 6 partials ⚠️
...a/com/linecorp/armeria/client/HttpChannelPool.java 64.28% 5 Missing and 5 partials ⚠️
...om/linecorp/armeria/common/DefaultTlsProvider.java 74.19% 7 Missing and 1 partial ⚠️
...corp/armeria/common/metric/CertificateMetrics.java 75.75% 8 Missing ⚠️
...ia/common/metric/AbstractCloseableMeterBinder.java 50.00% 7 Missing ⚠️
...om/linecorp/armeria/common/TlsProviderBuilder.java 68.42% 5 Missing and 1 partial ⚠️
.../java/com/linecorp/armeria/common/TlsProvider.java 16.66% 5 Missing ⚠️
... and 19 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5228      +/-   ##
============================================
- Coverage     74.75%   73.62%   -1.14%     
+ Complexity    21409    20168    -1241     
============================================
  Files          1877     1748     -129     
  Lines         79126    74634    -4492     
  Branches      10201     9526     -675     
============================================
- Hits          59152    54946    -4206     
+ Misses        15179    15139      -40     
+ Partials       4795     4549     -246     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ikhoon ikhoon changed the title Provide a way to dynamically update TLS certificates Provide a way to dynamically update TLS certificates for server and client Jan 3, 2024
@ikhoon
Copy link
Contributor

ikhoon commented Jan 3, 2024

This PR is ready to review. PTAL. The PR description has also been updated for the API design and implementation details.

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good! Server side looks good to me 👍 Left some minor questions on the client side 🙇

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks promising! 👍

@ikhoon ikhoon removed this from the 1.27.0 milestone Jan 24, 2024
Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some preliminary comments, looks good overall 👍

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through the client-side changes. Will take a look at the rest of the code next week 🙇


Bootstraps(HttpClientFactory clientFactory, EventLoop eventLoop, SslContext sslCtxHttp1Or2,
SslContext sslCtxHttp1Only) {
Bootstraps(HttpClientFactory clientFactory, EventLoop eventLoop,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: At first I wasn't sure whether TlsProvider should be called 1) on each connection open 2) or on each request (so TlsCacheContext is added to PoolKey).

After thinking about the most common use-cases, I think the current approach is more performant and probably covers most use-cases.

i.e.

  • If a server is simply advertising its validity, then I users can just use maxConnectionAge and allow clients to change the used tls certificate gradually.
  • If mTLS is used for authorization, new connections will probably be authorized anyways.

It may be difficult to add a tls related API to the WebClient level with this approach though. (we may also consider a hybrid approach where both the connection and poolkey is taken into account if needed in the future)

Comment on lines 104 to 115
@Deprecated
default boolean allowsUnsafeCiphers() {
return false;
}

/**
* Returns the {@link Consumer} which can arbitrarily configure the {@link SslContextBuilder} that will be
* applied to the SSL session.
*/
default Consumer<SslContextBuilder> tlsCustomizer() {
return builder -> {};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) What do you think of modifying the purpose of TlsProvider to return a SslContext instead of a KeyPair?

By taking this approach, we can 1) delegate the cache logic in TlsProviderUtil to users if needed 2) we potentially don't need to make breaking changes in ClientFactory 3) maybe just me, but the purpose of this interface seems clearer.

e.g.

class TlsProvider {
  SslContext find(String hostname, boolean allowsUnsafeCiphers, Consumer<SslContextBuilder> tlsCustomizer);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will move the responsibility of managing the life cycle of SslContext to TlsProvider, which may complicate things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several difficulties in doing so.

  • TlsProvider is designed to be used at both client and server. However, SslContext has different properties for them, e.g., SslContextBuilder.forClient() and SslContextBuilder.forServer(). SslContext for a client should not be returned for a server.
  • Creating a SslContext in Armeria optimized for HTTP/2 and TLS 1.3. Users who implement other own TlsProvder may feel difficult to build SslContext or we need to expose a factory method to easily create it. Instead, creating TlsKeyPair is straightforward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will move the responsibility of managing the life cycle of SslContext to TlsProvider, which may complicate things.

What do you think of modifying the method for creating the SSL context so that the TlsProvider doesn't have to manage the lifecycle at all? Here's a proposed interface change:

interface TlsProvider {
  SslContext createSslContext(String hostname, SessionProtocol sessionProtocol, TlsEngineType tlsEngineType);
}

I'm suggesting this because it seems like the current TlsProvider functions more as a data class than as a provider.

TlsProvider is designed to be used at both client and server.

At present, we can use TlsProvider.ofServer() for the client side as well. By validating the TlsProvider when it's assigned to a client or a server, we can prevent this misuse.
If you still prefer to use TlsProvider for both, we could add enum parameters to TlsEngineType such as CLIENT_JDK, SERVER_JDK, CLIENT_OPENSSL, etc.

Users who implement their own TlsProvider may find it difficult to build SslContext, or we might need to expose a factory method to make it easier.

That is true. I also worry about exposing Netty's SslContext API to the public. However, creating an SslContext might not be too difficult since Netty provides a well-designed SslContextBuilder API. To address the issue of exposing the API, we could provide our own abstraction, similar to what we did with DnsEndpointGroupBuilder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still questioning about exposing an API that creates SslContext.

  • The API forces all users who want to implement TlsProvider to learn how to build SslContext.
  • With the current API, users only need to convert their certificates using TlsKeyPair's factory method for basic usage.
  • Advanced users may change SslContext using the customizer.

I prefer the current style because would be more convenient for most users.

Copy link
Contributor

@jrhee17 jrhee17 Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a couple of points here:

On the argument of returning SslContext vs TlsKeyPair

Honestly I'm find with either way. Later on, I think we can allow a higher level API if users want

On TlsProvider methods

Say that I am a user implementing a custom TlsProvider - I'm not sure if it is possible to gracefully change certificates.

e.g. If I want from TlsKeyPair1 where allowsUnsafeCiphers=true should be guaranteed, to TlsKeyPair2 where allowsUnsafeCiphers=false, can I do this atomically? Is it possible that TlsKeyPair2 is sent with allowsUnsafeCiphers=true?

public interface TlsProvider {
  boolean allowsUnsafeCiphers()
  TlsKeyPair find(String hostname)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I split the TLS configuration parts from TlsProvider and added ServerTlsConfig and ClientTlsConfig. Those configurations are immutable objects which return a constant allowsUnsafeCiphers and other values.

Comment on lines 104 to 115
@Deprecated
default boolean allowsUnsafeCiphers() {
return false;
}

/**
* Returns the {@link Consumer} which can arbitrarily configure the {@link SslContextBuilder} that will be
* applied to the SSL session.
*/
default Consumer<SslContextBuilder> tlsCustomizer() {
return builder -> {};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will move the responsibility of managing the life cycle of SslContext to TlsProvider, which may complicate things.

Comment on lines 104 to 115
@Deprecated
default boolean allowsUnsafeCiphers() {
return false;
}

/**
* Returns the {@link Consumer} which can arbitrarily configure the {@link SslContextBuilder} that will be
* applied to the SSL session.
*/
default Consumer<SslContextBuilder> tlsCustomizer() {
return builder -> {};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will move the responsibility of managing the life cycle of SslContext to TlsProvider, which may complicate things.

What do you think of modifying the method for creating the SSL context so that the TlsProvider doesn't have to manage the lifecycle at all? Here's a proposed interface change:

interface TlsProvider {
  SslContext createSslContext(String hostname, SessionProtocol sessionProtocol, TlsEngineType tlsEngineType);
}

I'm suggesting this because it seems like the current TlsProvider functions more as a data class than as a provider.

TlsProvider is designed to be used at both client and server.

At present, we can use TlsProvider.ofServer() for the client side as well. By validating the TlsProvider when it's assigned to a client or a server, we can prevent this misuse.
If you still prefer to use TlsProvider for both, we could add enum parameters to TlsEngineType such as CLIENT_JDK, SERVER_JDK, CLIENT_OPENSSL, etc.

Users who implement their own TlsProvider may find it difficult to build SslContext, or we might need to expose a factory method to make it easier.

That is true. I also worry about exposing Netty's SslContext API to the public. However, creating an SslContext might not be too difficult since Netty provides a well-designed SslContextBuilder API. To address the issue of exposing the API, we could provide our own abstraction, similar to what we did with DnsEndpointGroupBuilder.

@ikhoon ikhoon modified the milestones: 1.30.0, 1.31.0 Aug 1, 2024
@github-actions github-actions bot added the Stale label Sep 9, 2024
@ikhoon ikhoon changed the title Provide a way to dynamically update TLS certificates for server Provide a way to dynamically update TLS certificates Sep 20, 2024
@ikhoon
Copy link
Contributor

ikhoon commented Sep 24, 2024

This PR is ready to review again.

@ikhoon ikhoon removed the Stale label Sep 24, 2024
Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 👍 👍 👍 Left minor comments and questions. 🙇


final ImmutableList<X509Certificate> trustedCerts = x509CertificateBuilder.build();
if (keyPairMappings.size() == 1 && keyPairMappings.containsKey("*")) {
return new StaticTlsProvider(keyPairMappings.get("*"), trustedCerts);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To pass the NullAway?

Suggested change
return new StaticTlsProvider(keyPairMappings.get("*"), trustedCerts);
final TlsKeyPair tlsKeyPair = keyPairMappings.get("*");
assert tlsKeyPair != null;
return new StaticTlsProvider(tlsKeyPair, trustedCerts);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me keep it as is since NullAway didn't complain about this.


final Bootstrap baseBootstrap = isDomainSocket ? unixBaseBootstrap : inetBaseBootstrap;
assert baseBootstrap != null;
return newBootstrap(baseBootstrap, remoteAddress, desiredProtocol, serializationFormat);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we should create bootstrap every time when connecting if sslContextFactory is set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Because we need a new ChannelInitializer for a different SslContext.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, I'm a bit worried about the performance. 🤔
Maybe @trustin can give some idea.

Copy link
Contributor

@ikhoon ikhoon Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I saw Bootstrap, I thought it was a simple class and did not refer to heavy objects.

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 🙇 🙇 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Specify a client TLS key pair without ClientFactory
6 participants