Skip to content

Commit

Permalink
fix(cli): cannot hotswap ECS task definitions containing certain intr…
Browse files Browse the repository at this point in the history
…insics (aws#26404)

## Reason for this change
Currently an ECS task definition cannot be hotswapped in many cases, for example when it contains references to certain AWS resources as environment variables. This PR removes the limitation and make hotswaps possible in much more various situations.

Specifically, if there is any token that is not part of the following, we cannot hotswap a task definition:

1. Ref functions, which will be resolved as physical resource ids
2. GetAtt functions for some resources and attributes (manually supported one by one, [code](https://github.com/aws/aws-cdk/blob/5ccc56975c323ea19fd0917def51184e13f440d9/packages/aws-cdk/lib/api/evaluate-cloudformation-template.ts#L352))
3. several simple CFn functions (join, split, select, sub)
4. parameters like AWS::AccountId, Region, Partition, or UrlSuffix

Although it is not supported much, using `GetAtt` function in a task definition is very common (imagine you reference other resource's property as an environment variable). This PR allows to hotswap a task definition even if it contains these tokens.

## Solution
To hotswap a task definition, we need to construct a task definition to call `registerTaskDefinition` API. For this, we have to [evaluate](https://github.com/aws/aws-cdk/blob/5ccc56975c323ea19fd0917def51184e13f440d9/packages/aws-cdk/lib/api/evaluate-cloudformation-template.ts#L134) CloudFormation template locally to resolve all the intrinsics in the template. However, some intrinsics such as `Fn::GetAtt` is not fully supported by CDK CLI because we have to manually support them for each AWS resource type. 

The reason why some task definitions are unhotswappable is that there are such intrinsics in the template and the CDK fails to evaluate it locally. So the basic idea to overcome this limitation in this PR is that we don't try to evaluate it locally, but we fetch the latest task definition already deployed and get the required values from it.

Here's how we can implement the idea.

### How we determine if changes to a task definition can be hotswapped
In the hotswap process, we have to decide whether the change can be hotswapped or not. Now we can hotswap the task definition if

1. there are only changes in `ContainerDefinitions` field, and all the fields in the task definition can be evaluated locally. (original behavior) OR,
2. there are only changes in `ContainerDefinitions` field, and all the **updated** field can be evaluated locally (added in this PR).

The first condition can actually be included in the second condition, but for now I keep it as-is to avoid the possibility of breaking the existing behavior.

If the second condition is true, we fetch the latest task definition from AWS account, override the updated fields, and register a new task definition to update a service. By this way, we don't have to evaluate tokens in unchanged fields locally, allowing to use hotswap in much more situations.

### How we compare the old and new task definition
Here is an example task definition:

```json
{
    "ContainerDefinitions": [
        {
            "Environment": [
                {
                    "Name": "VPC_ID",
                    "Value": {
                        "Fn::GetAtt": [
                            "Vpc8378EB38",
                            "CidrBlock"
                        ]
                    }
                }
            ],
            "Essential": true,
            "Image": "nginx:stable",
            "Name": "EcsApp"
        }
    ],
    "Cpu": "256",
    "Family": "myteststackTask289798EC",
    "Memory": "512",
    "NetworkMode": "awsvpc",
    "RequiresCompatibilities": [
        "FARGATE"
    ],
    "TaskRoleArn": {
        "Fn::GetAtt": [
            "TaskTaskRoleE98524A1",
            "Arn"
        ]
    }
}
```

We compare the old and new task definition in the following steps:

1. Check if there are only changes in `ContainerDefinitions` field. If not, we cannot hotswap.
2. Try `evaluateCfnExpression` on the containerDefinitons. If it can be evaluated, proceed to hotswap. If not, proceed to step 3.
3. Check if the length of `ContainerDefinitions` is the same. If not, we cannot hotswap.
4. For each container definition, deep-compare each key (e.g. `Environment`, `Image`, `Name`, etc)
5. For each key, if there is any diff in the corresponding value, try `evaluateCfnExpression` on the value. If the evaluation fails, we cannot hotswap.
6. After checking all the keys and there is no field that cannot be hotswapped, proceed to hotswap.

Imagine if there is a change only in `Image` field (container image tag)  but `Environment` field contains unsupported intrinsics (e.g. `"Fn::GetAtt": ["Vpc8378EB38", "CidrBlock"]`). In the previous CDK CLI we cannot hotswap it due to an evaluation error. We can now hotswap it because we don't have to evaluate the `Environment` field when it has no diffs.

Closes aws#25563

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
  • Loading branch information
tmokmss authored Aug 17, 2023
1 parent 0b8f31d commit 6d315b8
Show file tree
Hide file tree
Showing 3 changed files with 734 additions and 193 deletions.
212 changes: 212 additions & 0 deletions packages/aws-cdk/lib/api/hotswap/common.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import * as cfn_diff from '@aws-cdk/cloudformation-diff';
import { ISDK } from '../aws-auth';
import { CfnEvaluationException, EvaluateCloudFormationTemplate } from '../evaluate-cloudformation-template';

export const ICON = '✨';

Expand Down Expand Up @@ -135,6 +136,13 @@ export function lowerCaseFirstCharacter(str: string): string {
return str.length > 0 ? `${str[0].toLowerCase()}${str.slice(1)}` : str;
}

/**
* This function upper cases the first character of the string provided.
*/
export function upperCaseFirstCharacter(str: string): string {
return str.length > 0 ? `${str[0].toUpperCase()}${str.slice(1)}` : str;
}

export type PropDiffs = Record<string, cfn_diff.PropertyDifference<any>>;

export class ClassifiedChanges {
Expand Down Expand Up @@ -213,3 +221,207 @@ export function reportNonHotswappableResource(
reason,
}];
}

type ChangedProps = {
/**
* Array to specify the property from an object.
* e.g. Given this object `{ 'a': { 'b': 1 } }`, the key array for the element `1` will be `['a', 'b']`
*/
key: string[];

/**
* Whether the property is added (also modified) or removed.
*/
type: 'removed' | 'added';

/**
* evaluated value of the property.
* undefined if type == 'removed'
*/
value?: any
};

function detectChangedProps(next: any, prev: any): ChangedProps[] {
const changedProps: ChangedProps[] = [];
changedProps.push(...detectAdditions(next, prev));
changedProps.push(...detectRemovals(next, prev));
return changedProps;
}

function detectAdditions(next: any, prev: any, keys: string[] = []): ChangedProps[] {
// Compare each value of two objects, detect additions (added or modified properties)
// If we encounter CFn intrinsic (key.startsWith('Fn::') || key == 'Ref'), stop recursion

if (typeof next !== 'object') {
if (next !== prev) {
// there is an addition or change to the property
return [{ key: new Array(...keys), type: 'added' }];
} else {
return [];
}
}

if (typeof prev !== 'object') {
// there is an addition or change to the property
return [{ key: new Array(...keys), type: 'added' }];
}

// If the next is a CFn intrinsic, don't recurse further.
const childKeys = Object.keys(next);
if (childKeys.length === 1 && (childKeys[0].startsWith('Fn::') || childKeys[0] === 'Ref')) {
if (!deepCompareObject(prev, next)) {
// there is an addition or change to the property
return [{ key: new Array(...keys), type: 'added' }];
} else {
return [];
}
}

const changedProps: ChangedProps[] = [];
// compare children
for (const key of childKeys) {
keys.push(key);
changedProps.push(...detectAdditions((next as any)[key], (prev as any)[key], keys));
keys.pop();
}
return changedProps;
}

function detectRemovals(next: any, prev: any, keys: string[] = []): ChangedProps[] {
// Compare each value of two objects, detect removed properties
// To do this, find any keys that exist only in prev object.
// If we encounter CFn intrinsic (key.startsWith('Fn::') || key == 'Ref'), stop recursion
if (next === undefined) {
return [{ key: new Array(...keys), type: 'removed' }];
}

if (typeof prev !== 'object' || typeof next !== 'object') {
// either prev or next is not an object nor undefined, then the property is not removed
return [];
}

// If the prev is a CFn intrinsic, don't recurse further.
const childKeys = Object.keys(prev);
if (childKeys.length === 1 && (childKeys[0].startsWith('Fn::') || childKeys[0] === 'Ref')) {
// next is not undefined here, so it is at least not removed
return [];
}

const changedProps: ChangedProps[] = [];
// compare children
for (const key of childKeys) {
keys.push(key);
changedProps.push(...detectRemovals((next as any)[key], (prev as any)[key], keys));
keys.pop();
}
return changedProps;
}

/**
* return true when two objects are identical
*/
function deepCompareObject(lhs: any, rhs: any): boolean {
if (typeof lhs !== 'object') {
return lhs === rhs;
}
if (typeof rhs !== 'object') {
return false;
}
if (Object.keys(lhs).length != Object.keys(rhs).length) {
return false;
}
for (const key of Object.keys(lhs)) {
if (!deepCompareObject((lhs as any)[key], (rhs as any)[key])) {
return false;
}
}
return true;
}

interface EvaluatedPropertyUpdates {
readonly updates: ChangedProps[];
readonly unevaluatableUpdates: ChangedProps[];
}

/**
* Diff each property of the changes, and check if each diff can be actually hotswapped (i.e. evaluated by EvaluateCloudFormationTemplate.)
* If any diff cannot be evaluated, they are reported by unevaluatableUpdates.
* This method works on more granular level than HotswappableChangeCandidate.propertyUpdates.
*
* If propertiesToInclude is specified, we only compare properties that are under keys in the argument.
*/
export async function evaluatableProperties(
evaluate: EvaluateCloudFormationTemplate,
change: HotswappableChangeCandidate,
propertiesToInclude?: string[],
): Promise<EvaluatedPropertyUpdates> {
const prev = change.oldValue.Properties!;
const next = change.newValue.Properties!;
const changedProps = detectChangedProps(next, prev).filter(
prop => propertiesToInclude?.includes(prop.key[0]) ?? true,
);
const evaluatedUpdates = await Promise.all(
changedProps
.filter((prop) => prop.type === 'added')
.map(async (prop) => {
const val = getPropertyFromKey(prop.key, next);
try {
const evaluated = await evaluate.evaluateCfnExpression(val);
return {
...prop,
value: evaluated,
};
} catch (e) {
if (e instanceof CfnEvaluationException) {
return prop;
}
throw e;
}
}));
const unevaluatableUpdates = evaluatedUpdates.filter(update => update.value === undefined);
evaluatedUpdates.push(...changedProps.filter(prop => prop.type == 'removed'));

return {
updates: evaluatedUpdates,
unevaluatableUpdates,
};
}

function getPropertyFromKey(key: string[], obj: object) {
return key.reduce((prev, cur) => (prev as any)?.[cur], obj);
}

function overwriteProperty(key: string[], newValue: any, target: object) {
for (const next of key.slice(0, -1)) {
if (next in target) {
target = (target as any)[next];
} else if (Array.isArray(target)) {
// When an element is added to an array, we need explicitly allocate the new element.
target = {};
(target as any)[next] = {};
} else {
// This is an unexpected condition. Perhaps the deployed task definition is modified outside of CFn.
return false;
}
}
if (newValue === undefined) {
delete (target as any)[key[key.length - 1]];
} else {
(target as any)[key[key.length - 1]] = newValue;
}
return true;
}

/**
* Take the old template and property updates, and synthesize a new template.
*/
export function applyPropertyUpdates(patches: ChangedProps[], target: any) {
target = JSON.parse(JSON.stringify(target));
for (const patch of patches) {
const res = overwriteProperty(patch.key, patch.value, target);
if (!res) {
throw new Error(`failed to applying patch to ${patch.key.join('.')}. Please try deploying without hotswap first.`);
}
}
return target;
}
107 changes: 81 additions & 26 deletions packages/aws-cdk/lib/api/hotswap/ecs-services.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import * as AWS from 'aws-sdk';
import { ChangeHotswapResult, classifyChanges, HotswappableChangeCandidate, lowerCaseFirstCharacter, reportNonHotswappableChange, transformObjectKeys } from './common';
import { ChangeHotswapResult, classifyChanges, HotswappableChangeCandidate, lowerCaseFirstCharacter, reportNonHotswappableChange, transformObjectKeys, upperCaseFirstCharacter, applyPropertyUpdates, evaluatableProperties } from './common';
import { ISDK } from '../aws-auth';
import { EvaluateCloudFormationTemplate } from '../evaluate-cloudformation-template';

Expand All @@ -16,7 +16,8 @@ export async function isHotswappableEcsServiceChange(
// We only allow a change in the ContainerDefinitions of the TaskDefinition for now -
// it contains the image and environment variables, so seems like a safe bet for now.
// We might revisit this decision in the future though!
const classifiedChanges = classifyChanges(change, ['ContainerDefinitions']);
const propertiesToHotswap = ['ContainerDefinitions'];
const classifiedChanges = classifyChanges(change, propertiesToHotswap);
classifiedChanges.reportNonHotswappablePropertyChanges(ret);

// find all ECS Services that reference the TaskDefinition that changed
Expand All @@ -33,7 +34,8 @@ export async function isHotswappableEcsServiceChange(
// if there are no resources referencing the TaskDefinition,
// hotswap is not possible in FALL_BACK mode
reportNonHotswappableChange(ret, change, undefined, 'No ECS services reference the changed task definition', false);
} if (resourcesReferencingTaskDef.length > ecsServicesReferencingTaskDef.length) {
}
if (resourcesReferencingTaskDef.length > ecsServicesReferencingTaskDef.length) {
// if something besides an ECS Service is referencing the TaskDefinition,
// hotswap is not possible in FALL_BACK mode
const nonEcsServiceTaskDefRefs = resourcesReferencingTaskDef.filter(r => r.Type !== 'AWS::ECS::Service');
Expand All @@ -44,26 +46,76 @@ export async function isHotswappableEcsServiceChange(

const namesOfHotswappableChanges = Object.keys(classifiedChanges.hotswappableProps);
if (namesOfHotswappableChanges.length > 0) {
const taskDefinitionResource = await prepareTaskDefinitionChange(evaluateCfnTemplate, logicalId, change);
const familyName = await getFamilyName(evaluateCfnTemplate, logicalId, change);
if (familyName === undefined) {
reportNonHotswappableChange(ret, change, undefined, 'Failed to determine family name of the task definition', false);
return ret;
}
const oldTaskDefinitionArn = await evaluateCfnTemplate.findPhysicalNameFor(logicalId);
if (oldTaskDefinitionArn === undefined) {
reportNonHotswappableChange(ret, change, undefined, 'Failed to determine ARN of the task definition', false);
return ret;
}

const changes = await evaluatableProperties(evaluateCfnTemplate, change, propertiesToHotswap);
if (changes.unevaluatableUpdates.length > 0) {
reportNonHotswappableChange(ret, change, undefined, `Found changes that cannot be evaluated locally in the task definition - ${
changes.unevaluatableUpdates.map(p => p.key.join('.')).join(', ')
}`, false);
return ret;
}

ret.push({
hotswappable: true,
resourceType: change.newValue.Type,
propsChanged: namesOfHotswappableChanges,
service: 'ecs-service',
resourceNames: [
`ECS Task Definition '${await taskDefinitionResource.Family}'`,
`ECS Task Definition '${familyName}'`,
...ecsServicesReferencingTaskDef.map(ecsService => `ECS Service '${ecsService.serviceArn.split('/')[2]}'`),
],
apply: async (sdk: ISDK) => {
// Step 1 - update the changed TaskDefinition, creating a new TaskDefinition Revision
// we need to lowercase the evaluated TaskDef from CloudFormation,
// as the AWS SDK uses lowercase property names for these

// The SDK requires more properties here than its worth doing explicit typing for
// instead, just use all the old values in the diff to fill them in implicitly
const lowercasedTaskDef = transformObjectKeys(taskDefinitionResource, lowerCaseFirstCharacter, {
// All the properties that take arbitrary string as keys i.e. { "string" : "string" }
// https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RegisterTaskDefinition.html#API_RegisterTaskDefinition_RequestSyntax
// get the task definition of the family and revision corresponding to the old CFn template
const target = await sdk
.ecs()
.describeTaskDefinition({
taskDefinition: oldTaskDefinitionArn,
include: ['TAGS'],
})
.promise();
if (target.taskDefinition === undefined) {
throw new Error(`Could not find a task definition: ${oldTaskDefinitionArn}. Try deploying without hotswap first.`);
}

// The describeTaskDefinition response contains several keys that must not exist in a registerTaskDefinition request.
// We remove these keys here, comparing these two structs:
// https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RegisterTaskDefinition.html#API_RegisterTaskDefinition_RequestSyntax
// https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_DescribeTaskDefinition.html#API_DescribeTaskDefinition_ResponseSyntax
[
'compatibilities',
'taskDefinitionArn',
'revision',
'status',
'requiresAttributes',
'compatibilities',
'registeredAt',
'registeredBy',
].forEach(key=> delete (target.taskDefinition as any)[key]);

// the tags field is in a different location in describeTaskDefinition response,
// moving it as intended for registerTaskDefinition request.
if (target.tags !== undefined && target.tags.length > 0) {
(target.taskDefinition as any).tags = target.tags;
delete target.tags;
}

// Don't transform the properties that take arbitrary string as keys i.e. { "string" : "string" }
// https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RegisterTaskDefinition.html#API_RegisterTaskDefinition_RequestSyntax
const excludeFromTransform = {
ContainerDefinitions: {
DockerLabels: true,
FirelensConfiguration: {
Expand All @@ -79,7 +131,14 @@ export async function isHotswappableEcsServiceChange(
Labels: true,
},
},
});
} as const;
// We first uppercase the task definition to properly merge it with the one from CloudFormation template.
const upperCasedTaskDef = transformObjectKeys(target.taskDefinition, upperCaseFirstCharacter, excludeFromTransform);
// merge evaluatable diff from CloudFormation template.
const updatedTaskDef = applyPropertyUpdates(changes.updates, upperCasedTaskDef);
// lowercase the merged task definition to use it in AWS SDK.
const lowercasedTaskDef = transformObjectKeys(updatedTaskDef, lowerCaseFirstCharacter, excludeFromTransform);

const registerTaskDefResponse = await sdk.ecs().registerTaskDefinition(lowercasedTaskDef).promise();
const taskDefRevArn = registerTaskDefResponse.taskDefinition?.taskDefinitionArn;

Expand Down Expand Up @@ -171,14 +230,15 @@ interface EcsService {
readonly serviceArn: string;
}

async function prepareTaskDefinitionChange(
evaluateCfnTemplate: EvaluateCloudFormationTemplate, logicalId: string, change: HotswappableChangeCandidate,
) {
async function getFamilyName(
evaluateCfnTemplate: EvaluateCloudFormationTemplate,
logicalId: string,
change: HotswappableChangeCandidate) {
const taskDefinitionResource: { [name: string]: any } = {
...change.oldValue.Properties,
ContainerDefinitions: change.newValue.Properties?.ContainerDefinitions,
};
// first, let's get the name of the family
// first, let's get the name of the family
const familyNameOrArn = await evaluateCfnTemplate.establishResourcePhysicalName(logicalId, taskDefinitionResource?.Family);
if (!familyNameOrArn) {
// if the Family property has not been provided, and we can't find it in the current Stack,
Expand All @@ -189,17 +249,12 @@ async function prepareTaskDefinitionChange(
// remove it if needed
const familyNameOrArnParts = familyNameOrArn.split(':');
const family = familyNameOrArnParts.length > 1
// familyNameOrArn is actually an ARN, of the format 'arn:aws:ecs:region:account:task-definition/<family-name>:<revision-nr>'
// so, take the 6th element, at index 5, and split it on '/'
// familyNameOrArn is actually an ARN, of the format 'arn:aws:ecs:region:account:task-definition/<family-name>:<revision-nr>'
// so, take the 6th element, at index 5, and split it on '/'
? familyNameOrArnParts[5].split('/')[1]
// otherwise, familyNameOrArn is just the simple name evaluated from the CloudFormation template
// otherwise, familyNameOrArn is just the simple name evaluated from the CloudFormation template
: familyNameOrArn;
// then, let's evaluate the body of the remainder of the TaskDef (without the Family property)
return {
...await evaluateCfnTemplate.evaluateCfnExpression({
...(taskDefinitionResource ?? {}),
Family: undefined,
}),
Family: family,
};
// then, let's evaluate the body of the remainder of the TaskDef (without the Family property)

return family;
}
Loading

0 comments on commit 6d315b8

Please sign in to comment.