Develop to master - clean up OpsWorks replacement #546

Merged
merged 33 commits into from
Jul 2, 2024
ThrawnCA
Contributor

  • Add dynamic scaling policy to autoscaling groups
  • Remove all OpsWorks components

ThrawnCA and others added 30 commits May 8, 2024 15:53
- Shared yum packages are preinstalled, 4GB swapfile has been created, Supervisord service enabled.
No environment-specific changes have been made.
- Recreating Solr instances doesn't preserve the index. TODO Fix index sync so new instances can pick up the latest index.
[QOLDEV-819] exclude Solr instances from power management
…tion references

- Can't rename CloudFormation stacks at this point, or else they will be deleted and recreated,
which would be unnecessarily disruptive
[QOLDEV-839] drop OpsWorks resources (stack and layers) and documentation references
[QOLDEV-867] add autoscaling policy to target 50% CPU utilisation
- Automatic install during SSM Run Command sometimes fails without logging the reason
QOLDEV-892 update sandbox to Amazon Linux 2023
[QOLDEV-892] update all DEV environments to Amazon Linux 2023
[QOLDEV-892] update cookbook to support Amazon Linux 2023
- If desired capacity is already at minimum, attempting to put a server in standby without replacement will fail.
Detect this condition and spawn a replacement.
- Set minimum and desired capacities to the same value, now that our deployments can handle that
- Drop redundant config that is equivalent to the defaults
QOLDEV-867 make deployments handle autoscaling more robustly
[QOLDEV-902] update Archiver to ensure QA runs on uploaded files
[QOLDEV-892] update cookbook to retain support for Amazon Linux 2
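
For readers unfamiliar with the 50% CPU target from QOLDEV-867, here is a minimal sketch of an equivalent target-tracking policy expressed through the AWS CLI. This is an assumption for illustration only: the PR most likely defines the policy in its CloudFormation templates, and the helper name `target_tracking_config`, the policy name, and the `ASG_NAME` variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON for a target-tracking configuration
# that tracks average CPU utilisation across the autoscaling group.
# $1 is the target percentage (50 in QOLDEV-867).
target_tracking_config() {
  printf '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":%s}\n' "$1"
}

# Example call (commented out; ASG_NAME and the policy name are assumptions):
# aws autoscaling put-scaling-policy \
#   --auto-scaling-group-name "$ASG_NAME" \
#   --policy-name cpu-target-50 \
#   --policy-type TargetTrackingScaling \
#   --target-tracking-configuration "$(target_tracking_config 50)"
```

With a target-tracking policy, the group adds or removes instances to keep the metric near the target, so no separate scale-out and scale-in alarms need to be defined by hand.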
@@ -166,24 +166,28 @@ Resources:
echo '/dev/sdi /mnt/local_data xfs defaults,nofail 0 2' >> /etc/fstab
mount -a
fi
if ! (yum install chef); then
for i in `seq 1 5`; do
yum install -y libxcrypt-compat "https://packages.chef.io/files/stable/chef/14.15.6/el/7/chef-14.15.6-1.el7.x86_64.rpm" && break
Member

Can you add a link to the AWS docs (or similar) on which version we need to be locked to?

Contributor Author

I actually set v14 before discovering that we needed libxcrypt-compat. It appears that we can use the latest v18 instead.
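
The retry loop in the hunk above is cut off in this view, so here is a minimal sketch of the general pattern it appears to follow: run a command up to a fixed number of times and stop at the first success. The function name `retry` and the `RETRY_DELAY` variable are hypothetical and not taken from the template.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  _retry_max=$1
  shift
  _retry_n=1
  while [ "$_retry_n" -le "$_retry_max" ]; do
    "$@" && return 0
    echo "attempt $_retry_n of $_retry_max failed: $*" >&2
    # Pause between attempts, but not after the final failure.
    if [ "$_retry_n" -lt "$_retry_max" ]; then
      sleep "${RETRY_DELAY:-5}"
    fi
    _retry_n=$((_retry_n + 1))
  done
  return 1
}

# Example call (commented out; CHEF_RPM_URL is a placeholder):
# retry 5 yum install -y libxcrypt-compat "$CHEF_RPM_URL"
```

Logging each failed attempt to stderr is a deliberate choice here, since one of the commits notes that automatic installs during SSM Run Command sometimes fail without recording the reason.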

Member

@duttonw duttonw left a comment

Overall, I'd like some more comments, but that should be doable.

As for DECREMENT_BEHAVIOUR: if it's set to increment, it should add an instance and then wait for it to be stable before continuing the deployment. (Yes, it will slow the deployment by X mins, but it also means 0 downtime.)

ThrawnCA added 2 commits July 1, 2024 16:13
- This appears to work now that we have the XCrypt compatibility library
Contributor Author

ThrawnCA commented Jul 1, 2024

> As for DECREMENT_BEHAVIOUR: if it's set to increment, it should add an instance and then wait for it to be stable before continuing the deployment. (Yes, it will slow the deployment by X mins, but it also means 0 downtime.)

Ah, that's actually taken care of already. We now set the minimum instance count equal to the desired count, instead of 1 less. So, in any environment that requests at least 2 instances - which is the default and is applicable to all production environments - there will still be at least one fully operational instance at all times.

Previously, production might have a desired count of 2, minimum of 1, and deployments would put one into Standby while deploying to it, leaving the other to carry the load.

Now, production will have both desired and minimum counts of 2, and deployments will put one into Standby while launching a new one. The remaining instance will still carry the load in the meantime.
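
A minimal sketch of that decision, assuming the deployment script calls the AWS CLI directly (the actual scripts in this PR may differ): if the desired capacity is above the minimum, Standby can decrement it; otherwise the desired capacity is left unchanged so that the group launches a replacement. The function name `standby_decrement_flag` and the variables in the usage comment are hypothetical.

```shell
#!/bin/sh
# standby_decrement_flag DESIRED MIN
# Prints the enter-standby flag to use for the given capacities.
# Decrementing is only allowed while DESIRED is above MIN; at the
# minimum, keep the desired capacity so a replacement is launched.
standby_decrement_flag() {
  if [ "$1" -gt "$2" ]; then
    echo "--should-decrement-desired-capacity"
  else
    echo "--no-should-decrement-desired-capacity"
  fi
}

# Example call (commented out; all variables are placeholders):
# aws autoscaling enter-standby \
#   --instance-ids "$INSTANCE_ID" \
#   --auto-scaling-group-name "$ASG_NAME" \
#   "$(standby_decrement_flag "$DESIRED" "$MIN")"
```

With the old settings (desired 2, minimum 1) this yields the decrement flag; with the new settings (desired 2, minimum 2) it yields the no-decrement flag, which matches the behaviour described above.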

QOLDEV-892 update Chef client and improve handling of instances already in Standby
@ThrawnCA ThrawnCA merged commit 6d21ed3 into master Jul 2, 2024
17 checks passed
2 participants