Skip to content
This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

AWS ECS Runner Uninstall Fix #4792

Merged
merged 3 commits into from
Jun 12, 2023
Merged

Conversation

paladin-devops
Copy link
Contributor

@paladin-devops paladin-devops commented Jun 8, 2023

This PR fixes an error which sometimes occurred during a runner uninstall for AWS ECS runners:

❌ Uninstalling runner...
✓ Finding runner in ECS services...
! InvalidParameter: 1 validation error(s) found.

This occurred when no AWS EFS file systems were detected which had a tag waypoint-runner, with the value of the runner's ID assigned to it. This PR fixes that by properly looping through all paginated results of AWS EFS file systems to find a match, and later use that match correctly during deletion. If no matching file systems are found, runner uninstall will skip attempting to delete it.

There was also a possible resource leak with a defer inside of a for loop, which has been fixed.

Successful output looks like:

$ waypoint runner uninstall -platform=ecs -id=ABC123 -ecs-cluster=dev -ecs-region=us-east-1
✓ Runner "ABC123" uninstalled successfully
✓ Runner "ABC123" forgotten on server
✓ Waypoint runner AWS ECS service deleted
✓ Runner file system deleted

@paladin-devops paladin-devops added plugin/ecs ecosystem Things related to waypoint interacting with external systems backport/0.11.x labels Jun 8, 2023
@paladin-devops paladin-devops self-assigned this Jun 8, 2023
@github-actions github-actions bot added the core label Jun 8, 2023
The runner uninstall for AWS will now loop correctly through all file systems' paginated results using a marker. Some other updates are made in this commit to skip deleting the file system if certain conditions are met (if there are no file systems that exist, or none with the right tag). Step groups were also updated.
@paladin-devops paladin-devops force-pushed the b-ecs-runner-uninstall-file-system branch from 79fe273 to c596458 Compare June 8, 2023 20:59
@paladin-devops paladin-devops marked this pull request as ready for review June 8, 2023 20:59
@paladin-devops paladin-devops requested a review from a team as a code owner June 8, 2023 20:59
Copy link
Contributor

@catsby catsby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicks but nothing major that I see yet

done = true
} else {
marker = *fileSystemsResp.NextMarker
time.Sleep(5 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need the sleep here since we are not waiting for an installation or otherwise status change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this sleep because during testing, I triggered the API's rate limiting. That was before I had this loop working correctly, though so it shouldn't still happen. I will go ahead and remove this.

efsSvc := efs.New(sess)
marker := ""
done := false
var fileSystems []*efs.FileSystemDescription
for !done {
fileSystemsResp, err := efsSvc.DescribeFileSystems(&efs.DescribeFileSystemsInput{
Marker: aws.String(marker),
req := &efs.DescribeFileSystemsInput{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could possibly reduce this function by using DescribeFileSystemsPages to handle the pagination logic for us, so something like this:

var fileSystems []*efs.FileSystemDescription
err := client.DescribeFileSystemsPages(
  &efs.DescribeFileSystemsInput{}, // don't need to set marker. MaxItems also defaults to 100
  func(page *efs.DescribeFileSystemsOutput, lastPage bool) bool {
		fileSystems = append(fileSystems, page.FileSystems...)
		
    // double check that lastPage is the API telling us it's the last page though
    return lastPage 
})

fileSystemId = fileSystem.FileSystemId
goto DeleteFileSystem
if len(fileSystems) == 0 {
s.Update("No file systems detected, skipping deletion")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this scenario effectively skips us down to the return way down around line 462, the else is unnecessary if we simply add a return nil here, and it improves readability by reducing the mental stack people will make when reading by 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you here, will update this!

To add a bit of context as to why I did it this way, initially I had both cases of this if statement hitting the same step.Done() at the end of this function, just before the return. I moved that inside the else case, so the step is completed before the return nil is hit when the file system gets deleted, but forgot to apply the same up here.

Copy link
Contributor

@catsby catsby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The license header needs to be restored

@@ -1,6 +1,3 @@
// Copyright (c) HashiCorp, Inc.
// SPDX-License-Identifier: MPL-2.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how I managed to remove those - added back.

Simplify runner uninstall func for ECS by removing an `if`.

Add the copyright headers back in.
Copy link
Contributor

@catsby catsby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for making those changes. Just a question remaining about the removed err check other than that ✅

@@ -411,19 +420,16 @@ DeleteFileSystem:
describeMountTargetsResp, err := efsSvc.DescribeMountTargets(&efs.DescribeMountTargetsInput{
FileSystemId: fileSystemId,
})
if err != nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need to check this error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do! I lost this while moving other things around, but fixed up!

@paladin-devops
Copy link
Contributor Author

Thanks @catsby, pushed up that last err check and will merge after checks pass.

@paladin-devops paladin-devops merged commit a048593 into main Jun 12, 2023
@paladin-devops paladin-devops deleted the b-ecs-runner-uninstall-file-system branch June 12, 2023 15:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
backport/0.11.x core ecosystem Things related to waypoint interacting with external systems plugin/ecs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants