Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hot reloading tls certificates #2363

Open
cobalthex opened this issue Oct 12, 2022 · 19 comments
Open

Hot reloading tls certificates #2363

cobalthex opened this issue Oct 12, 2022 · 19 comments
Assignees
Labels
accepted An accepted request or suggestion request Request for new functionality
Milestone

Comments

@cobalthex
Copy link

Is your feature request motivated by a concrete problem? Please describe.

I am not sure if this feature already exists, but most SSL certs have an expiration and it would be nice to have the web server able to reload the certs (either automatically from the file specified, or manually by my own code)

Why this feature can't or shouldn't live outside of Rocket

Rocket implements TLS

Ideal Solution

Rocket provides a way to supply tls after launch, or supports hot reloading

@cobalthex cobalthex added the request Request for new functionality label Oct 12, 2022
@cobalthex cobalthex changed the title Hot reloading tls config Hot reloading tls certificates Oct 12, 2022
@SergioBenitez SergioBenitez added the accepted An accepted request or suggestion label Mar 29, 2023
@cobalthex
Copy link
Author

Not sure if this is a second ticket worthy but it would be nice to perhaps provide support for this via a callback which can allow for returning different values based on the incoming domain

@SergioBenitez
Copy link
Member

Not sure if this is a second ticket worthy but it would be nice to perhaps provide support for this via a callback which can allow for returning different values based on the incoming domain

Agreed. I think the ideal interface here is one where both Rocket can provide some default functionality (like updating the TLS certs if they change on disk) but applications can plug in their own dynamic TLS functionality at will as well. I don't have a concrete proposal for an api just yet, but I'm more than in favor of this change. Perhaps this can be part of #1070?

@cobalthex
Copy link
Author

Sure, ideally as long as it's not all managed or all user driven (ie it would be nice to say here's my tls function without having to manage every aspect of the connection manually)

@SergioBenitez
Copy link
Member

Sure, ideally as long as it's not all managed or all user driven (ie it would be nice to say here's my tls function without having to manage every aspect of the connection manually)

I don't think that interface would be a problem. The custom listener could simply forward the majority of the implementation to an existing listener.

@GentBinaku
Copy link

Can I work on this?

@SergioBenitez
Copy link
Member

SergioBenitez commented Apr 28, 2023

Can I work on this?

Yes! I'll post a quick guide on how to do so in the next day or so. In the meantime, it would be ideal for you to familiarize yourself with the http/tls module in core/http in the Rocket side of things, and
Acceptor and/or ResolvesServerCert on the rustls side of things.

@SergioBenitez
Copy link
Member

SergioBenitez commented May 2, 2023

Here's a guide on how to work on this issue. If you decide to work on this, please let me know. I'll assign the issue.

Getting Familiar

The first step, as always, is to get familiar with the relevant code.

  • Rocket's http crate contains all of the relevant HTTP server handling code, including handling TLS connections and certificates. The latter functionality is contained in the tls module.

  • Of particular importance is TlsListener, which is initialized by the main Rocket and handles all TLS connections:

    use crate::http::tls::TlsListener;
    
    let conf = config.to_native_config().map_err(ErrorKind::Io)?;
    let l = TlsListener::bind(addr, conf).await.map_err(ErrorKind::Bind)?;
    addr = l.local_addr().unwrap_or(addr);
    self.config.address = addr.ip();
    self.config.port = addr.port();
    ready(&mut self).await;
    return self.http_server(l).await;

You should read through the TlsListener code first and make sure you understand how it works. Particularly, the TlsListener::bind() static method sets up certificate handling using rustls' with_single_cert.

Using Dynamic Certificate Resolution

TlsListener::bind() will need to be changed so that the rustls' struct uses dynamic TLS resolution as opposed to the current static resolution via with_single_cert. There are a few options:

  1. Continue to use the builder and change with_single_cert to with_cert_resolver.

    This is the easiest approach. Going this route would require implementing the ResolvesServerCert trait. The implementation would likely need to run a task in the background to detect changes. That task, and the implementing resolver struct, will need to synchronize in some way to keep certificates up to date. This synchronization should ideally be lock-free and likely involve a single atomic swap when new certificates are available.

  2. Modify the entire processing chain to use the more flexible Acceptor. If you choose to go this route, or need to go this route, you'll likely need to adapt quite a bit of code to work with this. I wouldn't suggest going this route unless you need to.

Suggested Approach

  1. Read the code. Make sure you understand it.
  2. Implement a ResolvesServerCert struct that always resolve to the same cert/key. This means we add no new functionality yet but keep the existing functionality while having a path for the new features.
  3. Test that everything works as expected via the tls example. You'll need to run this example directly and check it in your browser/external client. We don't have an automated way to test this, unfortunately.
  4. Have your dynamic resolver spawn a task which changes certificates based on some simple condition (say, after 30s). Ensure that the resolver uses the "current" certificates all the time. This is just to test that everything related to synchronization and background tasks works as expected. The background task should be fully synchronous, and the resolver should never block waiting for anything. Synchronization should be as free as possible: we can and should expect at most N (N cores) cache line transfer if a cert is changed (once to write max one line in the background, once to read in the foreground). We could even do a single transfer by using N slots, if desired, to avoid a stampede, but we could also likely use relaxed atomics for even better performance.
  5. Make the background task update certs if the configured certs change. This will require choosing some way to be notified of changes. It's unlikely we want to poll or be notified by the disk - we may want to ignore some changes. A common approach is to reload based on a signal, say USR1. That's a good approach for now.

Bonus

If you're feeling adventurous, it would be amazing to add tests that truly exercise the TLS listener. This would mean bringing in an HTTP client library, starting up a real server, and making real requests to that server. This would be easier if we could set custom listeners, since then we wouldn't need to open a TCP connection, but we don't have that yet.

@GentBinaku
Copy link

@SergioBenitez I will work on the weekend As during the weekdays I am busy with C++

@martynp
Copy link

martynp commented Dec 22, 2023

@SergioBenitez Is there still a need for adding tls tests in /core/lib/tests? (I see there are some tests in the ./examples/tls project

hcldan added a commit to hcldan/Rocket that referenced this issue Mar 12, 2024
hcldan added a commit to hcldan/Rocket that referenced this issue Mar 12, 2024
hcldan added a commit to hcldan/Rocket that referenced this issue Mar 12, 2024
SergioBenitez added a commit that referenced this issue Apr 17, 2024
This commit introduces the ability to dynamically select a TLS
configuration based on the client's TLS hello.

Added `Authority::set_port()`.
Various `Config` structures for listeners removed.
`UdsListener` is now `UnixListener`.
`Bindable` removed in favor of new `Bind`.
`Connection` requires `AsyncRead + AsyncWrite` again
The `Debug` impl for `Endpoint` displays the underlying address in plaintext.
`Listener` must be `Sized`.
`tls` listener moved to `tls::TlsListener`
The preview `quic` listener no longer implements `Listener`.

All built-in listeners now implement `Bind<&Rocket>`.

Clarified docs for `mtls::Certificate` guard.

No reexporitng rustls from `tls`.
Added `TlsConfig::server_config()`.

Added some future helpers: `race()` and `race_io()`.

Fix an issue where the logger wouldn't respect a configuration during error
printing.

Added Rocket::launch_with(), launch_on(), bind_launch().

Added a default client.pem to the TLS example.

Revamped the testbench.

Added tests for TLS resolvers, MTLS, listener failure output.

TODO: clippy.
TODO: UDS testing.

Resolves #2730.
Resolves #2363.
Closes #2748.
Closes #2683.
Closes #2577.
SergioBenitez added a commit that referenced this issue Apr 17, 2024
This commit introduces the ability to dynamically select a TLS
configuration based on the client's TLS hello.

Added `Authority::set_port()`.
Various `Config` structures for listeners removed.
`UdsListener` is now `UnixListener`.
`Bindable` removed in favor of new `Bind`.
`Connection` requires `AsyncRead + AsyncWrite` again
The `Debug` impl for `Endpoint` displays the underlying address in plaintext.
`Listener` must be `Sized`.
`tls` listener moved to `tls::TlsListener`
The preview `quic` listener no longer implements `Listener`.

All built-in listeners now implement `Bind<&Rocket>`.

Clarified docs for `mtls::Certificate` guard.

No reexporitng rustls from `tls`.
Added `TlsConfig::server_config()`.

Added some future helpers: `race()` and `race_io()`.

Fix an issue where the logger wouldn't respect a configuration during error
printing.

Added Rocket::launch_with(), launch_on(), bind_launch().

Added a default client.pem to the TLS example.

Revamped the testbench.

Added tests for TLS resolvers, MTLS, listener failure output.

TODO: clippy.
TODO: UDS testing.

Resolves #2730.
Resolves #2363.
Closes #2748.
Closes #2683.
Closes #2577.
SergioBenitez added a commit that referenced this issue Apr 17, 2024
This commit introduces the ability to dynamically select a TLS
configuration based on the client's TLS hello.

Added `Authority::set_port()`.
Various `Config` structures for listeners removed.
`UdsListener` is now `UnixListener`.
`Bindable` removed in favor of new `Bind`.
`Connection` requires `AsyncRead + AsyncWrite` again
The `Debug` impl for `Endpoint` displays the underlying address in plaintext.
`Listener` must be `Sized`.
`tls` listener moved to `tls::TlsListener`
The preview `quic` listener no longer implements `Listener`.

All built-in listeners now implement `Bind<&Rocket>`.

Clarified docs for `mtls::Certificate` guard.

No reexporitng rustls from `tls`.
Added `TlsConfig::server_config()`.

Added some future helpers: `race()` and `race_io()`.

Fix an issue where the logger wouldn't respect a configuration during error
printing.

Added Rocket::launch_with(), launch_on(), bind_launch().

Added a default client.pem to the TLS example.

Revamped the testbench.

Added tests for TLS resolvers, MTLS, listener failure output.

TODO: clippy.
TODO: UDS testing.

Resolves #2730.
Resolves #2363.
Closes #2748.
Closes #2683.
Closes #2577.
@github-project-automation github-project-automation bot moved this from Backlog to Done in Rocket v0.6 Apr 17, 2024
@cobalthex
Copy link
Author

cobalthex commented May 6, 2024

👋 thanks for getting this in! Question wrt to configuring an app for TLS: TLS resolution happens via a Resolver, which does the user provided TLS lookup on request handshake. However, this requires that the app is configured for TLS, which (afaik) must be prefilled when starting the app up. This creates a bit of an awkward design, as Resolver::init() must supply a (I assume) valid configuration, and then again during resolve(), particularly when trying to serve certs for multiple domains

@SergioBenitez
Copy link
Member

This creates a bit of an awkward design, as Resolver::init() must supply a (I assume) valid configuration, and then again during resolve(), particularly when trying to serve certs for multiple domains

No, this part isn't correct. init() is about initialization the Resolver. It doesn't need to actually produce a TlsConfig/ServerConfig -- that part can happen entirely in a lazy fashion at request-time via resolve. For instance, the following is entirely valid:

struct MyResolver(Option<Arc<ServerConfig>>);

#[rocket::async_trait]
impl Resolver for MyResolver {
    async fn init(rocket: &Rocket<Build>) -> tls::Result<Self> {
        Ok(MyResolver(None))
    }

    async fn resolve(&self, hello: ClientHello<'_>) -> Option<Arc<ServerConfig>> {
        self.0.clone()
    }
}

Assuming something replaces the Option with a Some at some point, then this produces a ServerConfig lazily.

@cobalthex
Copy link
Author

cobalthex commented May 6, 2024

The issue I am seeing is starting up the HTTPS redirector based on the example from this repo:

🔒 HTTP -> HTTPS Redirector:
   >> Main instance is not being served over TLS/TCP.
   >> Redirector refusing to start.

and this while fielding requests
Warning: request failed: invalid HTTP method parsed (I am assuming it's trying to parse HTTPS as HTTP)

@SergioBenitez
Copy link
Member

Okay, I think I understand now. As it stands, TLS is only enabled if you provide a valid TLS configuration. That includes certificates/keys which are used as the fallback if there is no resolver or the resolver returns None.

It sounds like you want to use TLS without what is currently considered to be a complete TLS configuration opting instead to only have a resolver. Is that right?

@cobalthex
Copy link
Author

Ideally yeah

@the10thWiz
Copy link
Collaborator

How should we handle the case where a Resolver returns None? Currently, we fall back to some static certificate & keys configured on startup. I don't think we should change the resolver trait to return a certificate (rather than an option). There are valid reasons why a cert resolver wouldn't be able to return a valid certificate, trivially, if no such certificate exists.

We could make the static certificate optional, however, we would likely need to refuse any connection that doesn't return a certificate. Is this reasonable? We can't return a graceful error, our only options are a 'connection refused' or 'tls error'. That being said, it's quite reasonable to assume that a single running Rocket instance will outlive any certificate it was initially configured with, in which case we cannot have a valid static certificate configured when the instance started.

I think the best option would be to continue requiring users to supply a default certificate. We could add a note in the documentation that it is only used if the resolver does not return a certificate. In this case, it might not be a bad idea to suggest that using an expired certificate is not an issue. (It means cert resolver failures are reported as tls certificate expired).

@cobalthex
Copy link
Author

I would be fine with returning a TLS error/refused if there were no certificates returned. It would also be ok to provide (optional) fallbacks if one is fine w/ expired certs.

@SergioBenitez
Copy link
Member

I believe rustls allows creating a ServerConfig without a known cert/key and only a resolver. I don't see any documentation about what it does if the resolver can't return a valid configuration. @the10thWiz are you able to experiment and see what rustls does in that case?

@SergioBenitez
Copy link
Member

I'll reopen this while we investigate the feasibility of resolver-only configs.

@SergioBenitez SergioBenitez reopened this May 6, 2024
@the10thWiz
Copy link
Collaborator

After some tinkering, it looks like rustls refuses the connection. Firefox reports SSL_ERROR_ACCESS_DENIED_ALERT, and the server logs no server certificate chain resolved.

As far as I can tell, this is the same error as rejecting an mtls connection because we choose not to accept the client's certificate (At least on the client side). Obviously, the server logs explain exactly why the connection was refused.

I'm not sure what the correct option is here. I'd like to allow Resolver-only configs, especially since I don't think we can rely on a default cert to remain valid for the entire lifetime of the server. Right now, I see two options for behavior if we don't have a valid certificate: either we follow rustls's example, or we reset the tcp connection. Right now, I'm leaning towards the rustls behavior, since it makes it clear to the client that the issue is related to TLS certificates.

No matter what we choose, the client side of this is not pretty. We have very limited options in how we can respond without a valid certificate, so it might just be a case of picking an option, and logging an error that actually explains the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted An accepted request or suggestion request Request for new functionality
Projects
Status: Done
Development

No branches or pull requests

5 participants