[WORKFLOWS-44] Add Managed Scaling #53

Merged
merged 1 commit into main from WORKFLOWS-44 on Sep 2, 2021
Conversation

@tthyer (Contributor) commented on Sep 2, 2021

Adds a CapacityProvider to implement managed scaling. This allows the cluster to add an instance that will accommodate a new, updated task before spinning down the old one. It takes 5-6 minutes for this turnover to happen, and then another ~15 minutes to shut down the second cluster instance. We could possibly shorten the latter by adjusting TargetCapacity, but this seems to be working nicely now.
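
For readers outside this repo, a minimal sketch of the pattern being described. The resource names (EcsCluster, AutoScalingGroup) and the use of a ClusterCapacityProviderAssociations resource to attach the provider are placeholders and assumptions, not this repo's actual template; the ManagedScaling values are the ones added in this PR.

Resources:
  CapacityProvider:
    Type: AWS::ECS::CapacityProvider
    Properties:
      AutoScalingGroupProvider:
        # name or ARN of the Auto Scaling group that backs the cluster
        AutoScalingGroupArn: !Ref AutoScalingGroup
        ManagedScaling:
          Status: ENABLED
          MinimumScalingStepSize: 1  # add or remove at most one instance per scaling action
          MaximumScalingStepSize: 1
          TargetCapacity: 90         # leave ~10% headroom so a replacement task can start first

  CapacityProviderAssociation:
    Type: AWS::ECS::ClusterCapacityProviderAssociations
    Properties:
      Cluster: !Ref EcsCluster
      CapacityProviders:
        - !Ref CapacityProvider
      DefaultCapacityProviderStrategy:
        - CapacityProvider: !Ref CapacityProvider
          Weight: 1

With the capacity provider attached, ECS can scale the group out for the new task revision and scale it back in once the old task drains, which matches the turnover timing described above.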

@tthyer tthyer marked this pull request as ready for review September 2, 2021 01:09
@tthyer tthyer requested a review from a team as a code owner September 2, 2021 01:09
@tthyer tthyer requested a review from a team September 2, 2021 01:09
@BrunoGrandePhD (Contributor) left a comment

I'm not an ECS expert, but I don't see anything wrong here. 🚀

@zaro0508 (Contributor) left a comment

Looks good in general. I can approve as-is, but I had a few questions.

Comment on lines +162 to +164
TerminationPolicies:
- OldestLaunchConfiguration
- OldestInstance
Contributor

Any reason not to just let AWS manage this with the Default setting?

@tthyer (Author)

Yes, I want to be explicit about how this is going to behave so that anyone supporting it knows what to expect. It's also a little different from the default behavior. You can look that up.
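
For context, a sketch of where those policies sit on the Auto Scaling group (other required ASG properties elided; this is illustrative, not this repo's template). The policies are evaluated in the order listed, whereas the Default policy would, roughly, balance across Availability Zones first, then prefer the oldest launch configuration, then the instance closest to the next billing hour.

  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      # ... MinSize, MaxSize, launch configuration/template, subnets, etc.
      TerminationPolicies:
        - OldestLaunchConfiguration  # retire instances from an outdated launch configuration first
        - OldestInstance             # then fall back to the oldest instance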

Comment on lines +84 to +86
MinimumScalingStepSize: 1
MaximumScalingStepSize: 1
TargetCapacity: 90
@zaro0508 (Contributor)

The docs seem to indicate that if Status is set to ENABLED, then AWS auto scales for you. I think that means it will just ignore these scaling params, or am I reading it wrong?

The ManagedScaling property specifies the settings for the Auto Scaling group capacity provider.

When managed scaling is enabled, Amazon ECS manages the scale-in and scale-out actions of the Auto Scaling group. Amazon ECS manages a target tracking scaling policy using an Amazon ECS-managed CloudWatch metric with the specified targetCapacity value as the target value for the metric. For more information, see Using Managed Scaling in the Amazon Elastic Container Service Developer Guide.

If managed scaling is disabled, the user must manage the scaling of the Auto Scaling group.

@tthyer (Author)

@zaro0508 Good question. It's the opposite of how you're reading it; in my experience with clusters there's always some metric or metrics used as the measure so that it knows when to autoscale. Now, TargetCapacity is not a required value so I assume there's some default value they're using, but they don't specify what it is. I went with 90 because the aws-samples code I was looking at used that value; it works well enough so I left it at that.

Look at this set of instructions for creating the capacity provider in the console -- there's more explanation at #9. You can also read the deep dive blog post.
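
To spell out my understanding of how these three properties interact (my reading of the docs, not an authoritative statement): with managed scaling enabled, ECS creates a target tracking policy on the Auto Scaling group that tries to keep the ECS-managed CapacityProviderReservation metric at TargetCapacity, and the step sizes bound how much a single scaling action can change the group. An annotated version of the block in question, with Status shown for completeness:

      ManagedScaling:
        Status: ENABLED            # ECS creates and manages a target tracking scaling policy on the ASG
        MinimumScalingStepSize: 1  # lower bound on instances added/removed per scaling action
        MaximumScalingStepSize: 1  # upper bound on instances added/removed per scaling action
        TargetCapacity: 90         # target value for the CapacityProviderReservation metric;
                                   # a value below 100 keeps some spare capacity in the cluster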

@tthyer tthyer merged commit c1a5ca1 into main Sep 2, 2021
@tthyer tthyer deleted the WORKFLOWS-44 branch September 2, 2021 18:23