
‼️ NOTICE: (ecs-patterns) NetworkLoadBalancedFargateService and NetworkLoadBalancedEc2Service cause target group replacement #24642

Closed
otaviomacedo opened this issue Mar 16, 2023 · 2 comments
Labels
@aws-cdk/aws-ecs-patterns Related to ecs-patterns library bug This issue is a bug. management/tracking Issues that track a subject or multiple issues p0

Comments

@otaviomacedo
Contributor

otaviomacedo commented Mar 16, 2023

Please add your +1 👍 to let us know you have encountered this

Status: WORKAROUND ONLY

Overview

Versions 1.141.0 and 2.9.0 (released Jan 27, 2022) introduced a change in NetworkLoadBalancedFargateService and NetworkLoadBalancedEc2Service that may cause the replacement of a Load Balancing Target Group. If you are currently running a version earlier than 1.141.0 or 2.9.0 and you upgrade to a newer version of ecs-patterns, this change may interrupt your service for a couple of minutes. See below to determine whether this affects you.

Since the change was released, many users have come to depend on the new behavior. There is no way for us to revert to the old behavior without additionally breaking users who are running stacks that have been deployed using a version released after January 2022.

Therefore, if you will be affected by this change, we recommend that you put an override in place before upgrading ecs-patterns from a version before 1.141.0/2.9.0 to a version after.

We are sorry for the disruption this change has caused. Please know that we take breaking changes seriously and are constantly evolving our checks for them.

Am I affected?

TargetGroups used to be created with a default port of 80, irrespective of the containerPort you configured. A change was introduced that made the TargetGroup's port equal to the containerPort you configure. This leads to replacement of the resource.

You are affected if the value for containerPort in your code is different from 80. For example:

// v2 import paths; use the corresponding @aws-cdk/* packages on v1
import { ContainerImage } from 'aws-cdk-lib/aws-ecs';
import { NetworkLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns';

const service = new NetworkLoadBalancedFargateService(stack, 'MyFargateService', {
  taskImageOptions: {
    containerPort: 8443,   // <-- you are affected if this value is different from 80
    image: ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
  }
});

In this example, the value 8443 used to be ignored: the TargetGroup would be created with port 80, which is fine because ECS registers the correct port for each container as it registers the tasks with the load balancer. In newer versions of ecs-patterns, however, the value 8443 is copied onto the TargetGroup's port.

If this property is not defined or is set to 80, your application won't be affected.

Workaround

Use an escape hatch to force the port number of the underlying CfnTargetGroup to 80, regardless of the value of containerPort:

import { CfnTargetGroup } from 'aws-cdk-lib/aws-elasticloadbalancingv2';

const service = new NetworkLoadBalancedFargateService(stack, 'MyFargateService', {
  taskImageOptions: {
    containerPort: 8443,
    image: ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
  }
});

// Escape hatch: reach into the underlying CfnTargetGroup and pin its port to 80
const targetGroup = service.targetGroup.node.defaultChild as CfnTargetGroup;
targetGroup.port = 80;

After you have made this change, you can safely upgrade.
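
If you want to confirm the override before deploying, one option is a small assertions test against the synthesized template. This is a minimal sketch, assuming aws-cdk-lib v2 and its assertions module; the app and stack names are placeholders:

import { App, Stack } from 'aws-cdk-lib';
import { Template } from 'aws-cdk-lib/assertions';

// Synthesize the stack that contains the service (construct names are placeholders)
const app = new App();
const stack = new Stack(app, 'MyStack');
// ... create the NetworkLoadBalancedFargateService and apply the escape hatch here ...

// Verify the TargetGroup keeps port 80, so the upgrade will not trigger a replacement
Template.fromStack(stack).hasResourceProperties('AWS::ElasticLoadBalancingV2::TargetGroup', {
  Port: 80,
});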

@otaviomacedo otaviomacedo added bug This issue is a bug. management/tracking Issues that track a subject or multiple issues p0 @aws-cdk/aws-ecs-patterns Related to ecs-patterns library labels Mar 16, 2023
@otaviomacedo otaviomacedo pinned this issue Mar 16, 2023
@otaviomacedo otaviomacedo changed the title ‼️ NOTICE: (ecs-patterns) NetworkLoadBalancedFargateService and NetworkLoadBalancedEcsService cause target group replacement ‼️ NOTICE: (ecs-patterns) NetworkLoadBalancedFargateService and NetworkLoadBalancedEc2Service cause target group replacement Mar 16, 2023
otaviomacedo added a commit to cdklabs/aws-cdk-notices that referenced this issue Mar 16, 2023
mergify bot pushed a commit to cdklabs/aws-cdk-notices that referenced this issue Mar 16, 2023
@jasdrive1

I had some questions about this that I got answered, and I thought I should share the answers with others looking at this issue:

Summary:

It boils down to:

  • If you previously hard-coded targetGroup.port, there is nothing to worry about and no action to take.
  • If you have NOT hard-coded targetGroup.port, hard-code it to 80 now (see the workaround above). That way, when you upgrade CDK versions in the future, the port will not change from the current implicit value of 80 to the new implicit mapping targetGroup.port -> containerPort. Because it is explicitly set to 80, targetGroup.port will NOT change on a version upgrade (which could otherwise cause downtime).

More Context:

How can I be sure I did this correctly?

Run cdk diff after you upgrade your CDK version and check that the TargetGroup's Port isn't modified.

How will this issue cause downtime?

When the TG is replaced, the original resource is deleted and then the new one is created. Your service would be unable to receive traffic in that interval, because the TG holds information on the internal IPs of tasks, their ports and protocols, etc. It is therefore a necessary blocking construct between the load balancer listener and the service tasks.

You can mentally model it as "all traffic has to flow through the TG to get load balanced correctly".

Why does the issue reference port 80 to be used?

There's no particular reason the TG port has to be 80; it appears it was just used as a placeholder, since in CDK, when targets register themselves with a TG, they provide the port themselves. From the CFN docs:

Port
The port on which the targets receive traffic. This port is used unless you specify a port override when registering the target.

CDK self-registering targets always specify a port. Unfortunately, Port is a replacement field on a target group CFN resource. That's why you need to force it to 80: to avoid replacement on a new deployment. You can check whether your CFN template will be affected by running cdk diff after upgrading your CDK version and confirming that the TargetGroup's Port field is unchanged.
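
To make the registration point concrete, here is a rough sketch of wiring an ECS service to an NLB listener by hand (this is not what ecs-patterns does for you; the `nlb` setup, the 'web' container name and port 8443 are assumptions for illustration):

import { NetworkLoadBalancer } from 'aws-cdk-lib/aws-elasticloadbalancingv2';

// Assumes an existing FargateService `service` with a container named 'web' listening on 8443,
// and a VPC `vpc` in scope.
const nlb = new NetworkLoadBalancer(stack, 'Nlb', { vpc });
const listener = nlb.addListener('Listener', { port: 80 });

listener.addTargets('EcsTargets', {
  port: 80, // the TargetGroup's own port; changing this value forces a replacement
  targets: [
    // ECS supplies the real port per task when it registers targets with the TG
    service.loadBalancerTarget({ containerName: 'web', containerPort: 8443 }),
  ],
});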

@MrArnoldPalmer MrArnoldPalmer unpinned this issue Apr 17, 2023
@mrgrain mrgrain closed this as not planned Aug 24, 2023
@aws aws deleted a comment from github-actions bot Aug 24, 2023
@mrgrain
Contributor

mrgrain commented Aug 24, 2023

Closed as not planned as per the reasons in the issue. We have a CLI notice in place to inform users about this issue, and it can still be found via search.

Unfortunately comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a discussion/issue that references this one.
If you wish to keep having a conversation with other community members under this issue, feel free to do so.
