feat(deadline): use TLS between RenderQueue and clients by default #491

Merged
merged 6 commits into from
Jul 23, 2021
12 changes: 7 additions & 5 deletions integ/lib/render-struct.ts
@@ -11,7 +11,9 @@ import { X509CertificatePem } from 'aws-rfdk';
import {
IRepository,
RenderQueue,
+RenderQueueHostNameProps,
RenderQueueProps,
+RenderQueueTrafficEncryptionProps,
ThinkboxDockerRecipes,
} from 'aws-rfdk/deadline';
import { ThinkboxDockerImageOverrides } from './ThinkboxDockerImageOverrides';
@@ -52,9 +54,9 @@ export class RenderStruct extends Construct {
const maxLength = 64 - host.length - '.'.length - suffix.length - 1;
const zoneName = Stack.of(this).stackName.slice(0, maxLength) + suffix;

-let trafficEncryption: any;
-let hostname: any;
-let cacert: any;
+let trafficEncryption: RenderQueueTrafficEncryptionProps | undefined;
+let hostname: RenderQueueHostNameProps | undefined;
+let cacert: X509CertificatePem | undefined;

// If configured for HTTPS, the render queue requires a private domain and a signed certificate for authentication
if( props.protocol === 'https' ) {
@@ -72,8 +74,8 @@ export class RenderStruct extends Construct {
},
signingCertificate: cacert,
}),
-internalProtocol: ApplicationProtocol.HTTP,
},
+internalProtocol: ApplicationProtocol.HTTP,
};
hostname = {
zone: new PrivateHostedZone(this, 'Zone', {
@@ -83,7 +85,7 @@ export class RenderStruct extends Construct {
hostname: host,
};
} else {
-trafficEncryption = undefined;
+trafficEncryption = { externalTLS: { enabled: false } };
hostname = undefined;
}

1 change: 1 addition & 0 deletions packages/aws-rfdk/docs/upgrade/index.md
@@ -5,3 +5,4 @@ applications. The documentation is separated by RFDK versions that included pote
upgrading to (or beyond) a version listed below, you should consult the linked upgrade documentation.

* [`0.27.x`](./upgrading-0.27.md)
* [`0.37.x`](./upgrading-0.37.md)
55 changes: 55 additions & 0 deletions packages/aws-rfdk/docs/upgrade/upgrading-0.37.md
@@ -0,0 +1,55 @@
# Upgrading to RFDK v0.37.x or Newer

Starting in RFDK v0.37.0, TLS between the Render Queue and its clients is enabled by default. This is configured through the `RenderQueueExternalTLSProps` interface, which the `RenderQueue` construct accepts as part of its constructor props.

## Upgrading Farms Already Using TLS

If you are already setting fields on the `RenderQueueExternalTLSProps` for the Render Queue, no action is required. Redeploying your render farm after upgrading your version of RFDK should have no effect.

## Upgrading Farms Not Using TLS

To upgrade your farm if it does not currently configure TLS for connections to the Render Queue, there are two options:

1. [Migrating to TLS](#migrating-to-tls) (recommended)
1. [Preserving plain HTTP](#preserving-plain-http)
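
As a rough illustration of which case applies, the sketch below inspects the `trafficEncryption` value an app passes to the `RenderQueue`. The `TrafficEncryptionSketch` type and the `upgradeAction` helper are hypothetical stand-ins written for this guide, not part of RFDK, and the check assumes that a farm "already using TLS" is one that sets `externalTLS`.

```ts
// Hypothetical stand-in for the relevant part of RenderQueueTrafficEncryptionProps.
interface TrafficEncryptionSketch {
  readonly externalTLS?: { readonly enabled?: boolean };
}

type UpgradeAction = 'none' | 'migrate-or-preserve-http';

// Assumption: farms that already set externalTLS need no action when upgrading;
// all other farms must pick one of the two options listed above.
function upgradeAction(trafficEncryption?: TrafficEncryptionSketch): UpgradeAction {
  return trafficEncryption?.externalTLS !== undefined
    ? 'none'
    : 'migrate-or-preserve-http';
}

console.log(upgradeAction({ externalTLS: {} })); // none
console.log(upgradeAction(undefined));           // migrate-or-preserve-http
```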

### Migrating to TLS

#### RenderQueue Changes

Versions of RFDK prior to 0.37.0 already enabled internal TLS between the load balancer and its backing services by default. This is configurable with the `internalProtocol` field on the `RenderQueueTrafficEncryptionProps` interface. That default is unchanged, so upgrading RFDK has no effect on the protocol those backing services already use, and they will not need to be replaced. The TLS that is now enabled by default is between the listener on the load balancer and any Deadline clients connecting to it, which is configurable with the `externalTLS` property on the `RenderQueueTrafficEncryptionProps` interface.
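
To make the two hops concrete, here is a minimal sketch of how the effective protocol for each hop could be derived from the props under the defaults described above. The `Protocol` strings stand in for the CDK `ApplicationProtocol` enum, and `TrafficEncryptionSketch` and `resolveProtocols` are hypothetical names used only for illustration, not RFDK's implementation.

```ts
type Protocol = 'HTTP' | 'HTTPS';

// Hypothetical stand-in for RenderQueueTrafficEncryptionProps.
interface TrafficEncryptionSketch {
  readonly externalTLS?: { readonly enabled?: boolean };
  readonly internalProtocol?: Protocol;
}

interface ResolvedProtocols {
  // Deadline clients -> load balancer listener (default changed in v0.37.0).
  readonly clientToLoadBalancer: Protocol;
  // Load balancer -> backing services (default unchanged).
  readonly loadBalancerToBackend: Protocol;
}

function resolveProtocols(props: TrafficEncryptionSketch = {}): ResolvedProtocols {
  // External TLS is on unless explicitly disabled.
  const externalEnabled = props.externalTLS?.enabled ?? true;
  return {
    clientToLoadBalancer: externalEnabled ? 'HTTPS' : 'HTTP',
    // Internal TLS was already on by default before v0.37.0.
    loadBalancerToBackend: props.internalProtocol ?? 'HTTPS',
  };
}

console.log(resolveProtocols().clientToLoadBalancer); // HTTPS
console.log(
  resolveProtocols({ externalTLS: { enabled: false } }).clientToLoadBalancer,
); // HTTP
```

Note that disabling external TLS in this sketch leaves the internal hop untouched, which mirrors the point that the two settings are independent.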

There will be a few new constructs deployed to your farm:
1. A `PrivateHostedZone` will be created if you do not supply your own. We set the default domain to `aws-rfdk.com`, which we have registered and suggest that you use if you do not have your own registered domain. [RFC 6762](https://datatracker.ietf.org/doc/html/rfc6762#appendix-G) recommends against using any unregistered top-level domains.
1. A self-signed X509 certificate will be generated using OpenSSL and then used to sign a certificate that the Render Queue will use for TLS. Specifically, the certificate will be passed to the Application Listener for the Application Load Balancer that the Render Queue creates. Additional details about how RFDK uses TLS and the built-in certificate management can be found in the developer guide for [Encryption in transit](https://docs.aws.amazon.com/rfdk/latest/guide/security-encrypt-in-transit.html).

These new constructs require the Render Queue load balancer's listener to be replaced, but the load balancer itself and the backing services it routes traffic to will not need to be changed.
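
The defaults behind these new constructs can be sketched as follows. This is an illustrative model only: `HostNameSketch`, `resolveFqdn`, and `certificateSource` are hypothetical names, the zone is represented by a plain domain string rather than an `IPrivateHostedZone`, and throwing when both certificates are supplied is an assumption rather than documented behavior.

```ts
// Hypothetical stand-in for RenderQueueHostNameProps, using a plain zone name.
interface HostNameSketch {
  readonly hostname?: string;
  readonly zoneName?: string;
}

const DEFAULT_HOSTNAME = 'renderqueue';
const DEFAULT_ZONE_NAME = 'aws-rfdk.com';

// Fully qualified name that clients would use to reach the Render Queue.
function resolveFqdn(props: HostNameSketch = {}): string {
  const hostname = props.hostname ?? DEFAULT_HOSTNAME;
  const zoneName = props.zoneName ?? DEFAULT_ZONE_NAME;
  return `${hostname}.${zoneName}`;
}

type CertificateSource = 'acm' | 'rfdk-supplied' | 'rfdk-generated';

// Which certificate the listener ends up with when external TLS is enabled.
function certificateSource(tls: {
  readonly acmCertificate?: unknown;
  readonly rfdkCertificate?: unknown;
}): CertificateSource {
  const hasAcm = tls.acmCertificate !== undefined;
  const hasRfdk = tls.rfdkCertificate !== undefined;
  if (hasAcm && hasRfdk) {
    // Assumption: supplying both is treated as a configuration error.
    throw new Error('Provide at most one of acmCertificate and rfdkCertificate');
  }
  if (hasAcm) return 'acm';
  if (hasRfdk) return 'rfdk-supplied';
  // Neither supplied: a certificate is generated and used.
  return 'rfdk-generated';
}

console.log(resolveFqdn());         // renderqueue.aws-rfdk.com
console.log(certificateSource({})); // rfdk-generated
```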

#### WorkerInstanceFleet and SpotEventPluginFleet Changes

Because the endpoint and port used by the load balancer's listener will change, and TLS requires connecting clients to verify the Render Queue's certificate, any stacks that depend on the Render Queue must first be destroyed. If you are using a tiered architecture similar to what we recommend in our documentation, this includes any `WorkerInstanceFleet`, `SpotEventPluginFleet`, and `ConfigureSpotEventPlugin` constructs. If you are not using a tiered architecture, we still recommend destroying these constructs, because the endpoint they connect to is changing and that endpoint is configured in the initialization script for an instance.

To perform the removal of these constructs:
1. Suspend any jobs that are being run by workers that the constructs deployed.
2. If you are using the `ConfigureSpotEventPlugin` and `SpotEventPluginFleet` constructs, then any spot fleets launched by the Spot Event Plugin will need to be terminated, since their lifecycle is controlled by Deadline's Spot Event Plugin and not your RFDK app. Trying to destroy/remove the `SpotEventPluginFleet` construct will fail if these hosts are left running because the spot instances use the security group that the construct creates.
3. Next we have to destroy the constructs that are deploying workers, which could be done in a few ways:
   1. If you can destroy the Stack that contains these constructs without destroying the rest of your app, then destroy it using the command `cdk destroy "ComputeTier"` (or whatever name you gave your stack).
   2. If you cannot destroy a single stack or you are not using a tiered architecture, you can just comment out these constructs in your app, rebuild it, and then run `cdk deploy "*"` to perform the removal.
4. Now that we won't cause any dependency issues, we can upgrade the version of RFDK and then run the deployment. If worker constructs were commented out in the last step, they can be added back in here and redeployed during the upgrade.
5. Any jobs that were paused can be resumed after the deployment is complete. Any workers deployed with the `WorkerInstanceFleet` should be connecting through TLS now, and the Spot Event Plugin should be configured so that any new Spot instances it deploys will be properly configured as well.
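
The ordering of these steps can be summarized in a small sketch that builds a checklist for a given farm layout. The `FarmLayout` type and `upgradeSteps` helper are hypothetical, the default stack name `ComputeTier` is taken from the example above, and using `cdk deploy "*"` for the final deployment is an assumption about how you deploy your app.

```ts
// Hypothetical description of how a farm's worker constructs are organized.
interface FarmLayout {
  // True when worker constructs live in a stack that can be destroyed on its own.
  readonly workersInSeparateStack: boolean;
  // Name of that stack, if any.
  readonly computeStackName?: string;
  // True when ConfigureSpotEventPlugin / SpotEventPluginFleet are in use.
  readonly usesSpotEventPlugin: boolean;
}

function upgradeSteps(farm: FarmLayout): string[] {
  const steps: string[] = [
    'Suspend jobs running on workers deployed by the fleet constructs',
  ];
  if (farm.usesSpotEventPlugin) {
    // Spot fleets are managed by Deadline, so they must be terminated manually.
    steps.push('Terminate spot fleets launched by the Spot Event Plugin');
  }
  if (farm.workersInSeparateStack) {
    steps.push(`cdk destroy "${farm.computeStackName ?? 'ComputeTier'}"`);
  } else {
    steps.push('Comment out the worker constructs and rebuild the app');
    steps.push('cdk deploy "*"');
  }
  steps.push('Upgrade RFDK to 0.37.0 or newer and restore any commented-out constructs');
  steps.push('cdk deploy "*"');
  steps.push('Resume the suspended jobs');
  return steps;
}

upgradeSteps({ workersInSeparateStack: true, usesSpotEventPlugin: true })
  .forEach((step, i) => console.log(`${i + 1}. ${step}`));
```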

### Preserving plain HTTP

While we strongly suggest upgrading farms to use TLS, it is possible to override the new default and keep a farm on HTTP. To do this, set the `enabled` field on `RenderQueueExternalTLSProps` to `false`. This prevents the farm from automatically upgrading the protocol until you decide you're ready. Here's an example of creating a Render Queue with TLS disabled:

```ts
new RenderQueue(this, 'RenderQueue', {
  vpc,
  images,
  repository,
  version,
  trafficEncryption: {
    externalTLS: { enabled: false },
  },
});
```
17 changes: 13 additions & 4 deletions packages/aws-rfdk/lib/deadline/lib/render-queue-ref.ts
@@ -56,7 +56,10 @@ export interface RenderQueueHostNameProps {
readonly hostname?: string;

/**
- * The private zone to which the DNS A record for the render queue will be added.
+ * The private zone to which the DNS A record for the render queue will be added. We do not recommend
+ * using an unregistered domain for your PrivateHostedZone and we have registered aws-rfdk.com that
+ * can be used if you do not own your own. Refer to RFC 6762 Appendix G for more details about private
+ * DNS namespaces: https://datatracker.ietf.org/doc/html/rfc6762#appendix-G
*/
readonly zone: IPrivateHostedZone;
}
@@ -148,9 +151,15 @@ export interface RenderQueueHealthCheckConfiguration {
* In both cases the certificate chain **must** include only the CA certificates PEM file due to a known limitation in Deadline.
*/
export interface RenderQueueExternalTLSProps {
+/**
+ * Whether to enable TLS between the Render Queue and Deadline clients.
+ * @default true
+ */
+readonly enabled?: boolean;
+
/**
* The ACM certificate that will be used for establishing incoming external TLS connections to the RenderQueue.
- * @default If not provided then the rfdkCertificate must be provided.
+ * @default If rfdkCertificate and acmCertificate are both not provided when TLS is enabled, an rfdkCertificate will be generated and used.
*/
readonly acmCertificate?: ICertificate;

@@ -166,7 +175,7 @@ export interface RenderQueueExternalTLSProps {
/**
* The parameters for an X509 Certificate that will be imported into ACM then used by the RenderQueue.
*
- * @default If not provided then an acmCertificate and acmCertificateChain must be provided.
+ * @default If rfdkCertificate and acmCertificate are both not provided when TLS is enabled, an rfdkCertificate will be generated and used.
*/
readonly rfdkCertificate?: IX509CertificatePem;
}
@@ -281,7 +290,7 @@ export interface RenderQueueProps {
/**
* Hostname to use to connect to the RenderQueue.
*
- * @default A hostname is generated by the Application Load Balancer that fronts the RenderQueue.
+ * @default - The hostname `renderqueue` will be used and a PrivateHostedZone will be created with the domain name `aws-rfdk.com`
*/
readonly hostname?: RenderQueueHostNameProps;
