Add an autoscaling group for the docs-rs-builder #243
Conversation
device_name = "/dev/sda1"

ebs {
  volume_size = 64
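For orientation, the hunk above is a `block_device_mappings` entry inside the launch template this PR creates. A minimal sketch of the surrounding structure (the resource name is an assumption, not necessarily the PR's actual identifier):

```hcl
resource "aws_launch_template" "docs_rs_builder" {
  name_prefix = "docs-rs-builder-"
  image_id    = data.aws_ami.docs_rs_builder.id # latest builder AMI (see the PR description)

  block_device_mappings {
    device_name = "/dev/sda1" # root device

    ebs {
      volume_size = 64 # GiB; the value under discussion in this thread
    }
  }
}
```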
Seeing the current filesystem usage (~100 GB), I would prefer something at least double this size. (Of course, the current usage also includes the database and some web cache.)
It would be nice if we didn't need so much storage. I believe a large part of it is only needed during a single crate's build and can be deleted afterwards, no? Would it be possible to add some cleanup to the builder process so that filesystem usage doesn't grow so large?
Hm, we are already cleaning up after each build, plus the cleanup tasks for Docker images, which are in cron right now. (By the way, cc @Nemo157 @jyn514: these cronjobs would need to be configured in our Ansible images too, right?)
Looking only at the above, I could imagine just trying the current definition, letting the builder build, and watching how much space gets used (assuming the big Docker image is configured?).
But we're also planning on adding some build-artifact caching: rust-lang/docs.rs#1757. (Of course, we could hold off on increasing storage until that feature is finished.)
Yep, we just have a daily cronjob (a systemd timer) running `docker container prune --force && docker image prune --force` (plus cargo-sweep, which shouldn't be necessary if we rebuild the image for each new version?).
Where is the cronjob currently configured? I can add it to the Ansible configuration (though I wouldn't block merging this on it).
It's in:
/etc/systemd/system/prune-disk-space.service
/etc/systemd/system/prune-disk-space.timer
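For reference, a sketch of what those two units plausibly contain, based only on the daily docker prune command quoted above (the files actually deployed may differ):

```ini
# /etc/systemd/system/prune-disk-space.service (sketch)
[Unit]
Description=Prune unused Docker containers and images

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'docker container prune --force && docker image prune --force'

# /etc/systemd/system/prune-disk-space.timer (sketch)
[Unit]
Description=Run prune-disk-space daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```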
Two questions:

Correct, the autoscaling is currently whatever the default is for EC2 instance health checks, which I believe is super basic (e.g., if the instance goes into serviced mode). This is definitely not what we want, but I'd like to address it in a follow-up PR.

Correct.
min_num_builder_instances = 1
max_num_builder_instances = 1
I'm confused about what the autoscaling does when you've pinned it to always be one instance.
It ensures that there's always one healthy instance: if the instance stops or gets terminated, a new one boots to replace it.
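Concretely, a minimal sketch of that behaviour in Terraform, reusing the launch-template sketch from earlier (the variable declarations and availability zone are illustrative placeholders, not the PR's actual configuration):

```hcl
# Illustrative variable declarations matching the diff above.
variable "min_num_builder_instances" {
  type    = number
  default = 1
}

variable "max_num_builder_instances" {
  type    = number
  default = 1
}

resource "aws_autoscaling_group" "docs_rs_builder" {
  # min = max = 1 disables scaling but keeps self-healing: if the single
  # instance fails its health check, the group replaces it.
  min_size         = var.min_num_builder_instances
  max_size         = var.max_num_builder_instances
  desired_capacity = 1

  # "EC2" is the default, "super basic" health check discussed above;
  # deeper checks would be the follow-up PR mentioned earlier.
  health_check_type = "EC2"

  availability_zones = ["us-east-1a"] # placeholder

  launch_template {
    id      = aws_launch_template.docs_rs_builder.id
    version = "$Latest"
  }
}
```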
This adds an autoscaling group for the docs-rs-builder.
Currently this works by grabbing the latest docs-rs-builder AMI, creating a launch template with that AMI, and then using that template to make an autoscaling group.
I'm not sure we want to actually deploy this way in the long term, but I think it's a good start for testing how the autoscaling group behaves in practice.
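A hedged sketch of that flow, tying the pieces above together (the AMI name filter is an assumption about the naming scheme, not taken from the PR):

```hcl
# 1. Grab the latest docs-rs-builder AMI (the name filter is assumed).
data "aws_ami" "docs_rs_builder" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["docs-rs-builder-*"]
  }
}

# 2. The launch template sketched earlier consumes it via
#    image_id = data.aws_ami.docs_rs_builder.id.
# 3. The autoscaling group points at that template with version = "$Latest",
#    so replacement instances pick up the newest AMI after a `terraform apply`.
```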