Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add symlink to latest version for amazon-ssm-agent #3986

Merged
merged 1 commit into from
May 30, 2024

Conversation

koooosh
Copy link
Contributor

@koooosh koooosh commented May 24, 2024

Issue: #3988

v2

Description of changes:
Making the following modifications to the amazon-ssm-agent package:

  • Creating a latest symlink so that downstream users can easily pick up the most recent agent version

Reason for revision: As per offline discussion with @arnaldo2792 and @bcressey, we are removing the dynamically-linked agent binaries for now since it currently comes with an explicit dependency on bash. We want to make this bash dependency optional so they will be re-added in a later patch.

Testing done:

For the aws-ecs variants, verified instance is able to join a cluster and ECS Exec succeeds.

Verified symlink exists in the correct location:

[root@admin]# sudo sheltie
bash-5.1# ls /usr/libexec/amazon-ssm-agent/bin/
3.3.418.0  latest

v1

Description of changes:
Making the following modifications to the amazon-ssm-agent package:

  • Changing the base package to vendor dynamically-linked agent binaries
  • Adding this dynamically-linked agent to the aws-dev variant
  • Adding a subpackage which vendors statically-linked agent binaries
  • Creating a latest symlink so that downstream users can easily pick up the most recent agent version
  • Adding this statically-linked agent to all aws-ecs variants

Testing done:

For the aws-dev variant, verified that SSM can be used to connect to the instance. Also, SSM send-command succeeds.

For the aws-ecs variants, verified instance is able to join a cluster and ECS Exec succeeds.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@koooosh koooosh requested a review from arnaldo2792 May 24, 2024 00:51
@koooosh koooosh self-assigned this May 24, 2024
@koooosh koooosh force-pushed the modify-ssm-agent branch from b531341 to 69c3403 Compare May 24, 2024 08:34
@koooosh koooosh marked this pull request as ready for review May 24, 2024 15:48
@koooosh koooosh requested review from bcressey, larvacea and KCSesh May 24, 2024 15:52
Copy link
Member

@larvacea larvacea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

# requested), don't restart at all.
Restart=always
RestartPreventExitStatus=194
RestartSec=90
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we could get away with a lower value if this became a concern. I am absolutely fine with 90 seconds until we learn otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this also appears to be their default (confirmed myself) which I think is a good starting point!

Copy link
Contributor

@KCSesh KCSesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

"glibc",
]

[[package]]
name = "aws-dev"
version = "0.1.0"
dependencies = [
"amazon-ssm-agent",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ideally, the commit that modified the aws-dev variant should include the changes to the Cargo.lock file that add the SSM agent.

Comment on lines 17 to 19

[dependencies]
bash = { path = "../bash" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this about? I don't like this change. bash is only for the *-dev variants; it's not allowed as a package dependency.

What I'm not seeing here that I expected to see based on previous discussions is a patch so SSM agent doesn't need bash to execute commands. Ideally that would be some sort of environment or config-driven toggle, and the default would be to not require bash (because it's not present on production variants).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When testing, I checked this and saw that bash is present in the aws-dev variant only and NOT present in the ecs variants:

I enabled admin container and disabled control container before SSHing into the instances.

## aws-dev variant instance
 
[ec2-user@admin]$ sudo sheltie
bash-5.1# find /usr/bin -name "bash" 
/usr/bin/bash
bash-5.1# 
 
## ecs variant instance
 
[ec2-user@admin]$ sudo sheltie
bash-5.1# find /usr/bin -name "bash"
bash-5.1# 

Regarding the SSM agent patch to remove bash dependency, that task is still in the works. An initial effort to do that was unsuccessful, so we need more time to complete that. @arnaldo2792 can provide you with more details on this.

Comment on lines 9 to 15
# Restart the agent regardless of whether it crashes (and returns a non-zero result code) or if
# is terminated normally (e.g. via 'kill -HUP'). Delay restart so that the agent is less likely
# to restart during a reboot initiated by a script. If the agent exits with status 194 (reboot
# requested), don't restart at all.
Restart=always
RestartPreventExitStatus=194
RestartSec=90
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't make sense to me. A properly initiated reboot should cause systemd to transition to shutdown.target which will cause this service to stop (because of default dependencies).

I'm OK with the special treatment for the 194 exit code, but the delay isn't good. It means if the agent crashes for whatever reason it will take 90 seconds to come back up, which is an eternity if SSM is the only mechanism for managing the node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with lowering the restart time, but we would be altering the default SSM agent configuration. Are we okay with this? If so, what restart time do you recommend?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with lowering the restart time, but we would be altering the default SSM agent configuration. Are we okay with this? If so, what restart time do you recommend?

I don't have a specific recommendation, just the feedback that 90 seconds is too long. You can look at other RestartSec values in packages/*/*.service to get a sense of the acceptable ranges.

Whatever value you pick, you should back up with data along the lines of "I inspected the ssm agent source code and I'm confident this is fine because ..." or "I tested the script-initiated reboot path in Bottlerocket, inspected the system journal, and confirmed that no unexpected restarts were triggered even with a lower restart time."

@@ -18,6 +18,7 @@ included-packages = [
# core
"release",
"kernel-5.10",
"amazon-ssm-agent-plugin",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to remove the ecs-agent.spec dependency on amazon-ssm-agent, or else you will automatically install that package on all ECS variants.

You should replace that dependency with the plugin dependency, so that ecs-agent takes care of its own requirements and variants don't have to specify them.

@koooosh
Copy link
Contributor Author

koooosh commented May 29, 2024

Please see updated description under v2 for recent changes

@koooosh koooosh changed the title Create different variations of the amazon-ssm-agent based on compilation type Modify statically-linked amazon-ssm-agent and add it to ECS variants May 29, 2024
@bcressey
Copy link
Contributor

ecs-agent already requires amazon-ssm-agent so I don't understand the purpose of this change. It seems to just involve renaming subpackages with amazon-ssm-agent in the current form.

Signed-off-by: Kush Upadhyay <kushupad@amazon.com>
@koooosh koooosh force-pushed the modify-ssm-agent branch from b6c44ba to fc2ab81 Compare May 29, 2024 22:57
@koooosh koooosh changed the title Modify statically-linked amazon-ssm-agent and add it to ECS variants Add symlink to latest version for amazon-ssm-agent May 29, 2024
@koooosh koooosh merged commit cb0be44 into bottlerocket-os:develop May 30, 2024
33 checks passed
@koooosh koooosh deleted the modify-ssm-agent branch July 15, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants