Figure out docker hub caching #176

Open
ehuss opened this issue Oct 26, 2024 · 2 comments · May be fixed by rust-lang/rust#134135

ehuss commented Oct 26, 2024

In GitHub Actions, we periodically hit the Docker Hub rate limit, which was introduced in November 2020 (the error is "429 Too Many Requests"). This affects any repo using Docker (such as rust-lang/rust and rust-lang/cargo).

The anonymous Docker Hub rate limit is 100 pulls / 6 hours / IP. source
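
For reference, a minimal Python sketch of how a runner could check how much of that limit is left. It assumes the rate-limit preview endpoint and the `ratelimit-limit` / `ratelimit-remaining` response headers described in Docker's rate limit documentation; the function names are made up for illustration. The header value looks like `100;w=21600`, where `w` is the window in seconds (21600 s = 6 hours).

```python
import json
import urllib.request

# Endpoints taken from Docker's rate limit documentation (assumed still current).
TOKEN_URL = (
    "https://auth.docker.io/token"
    "?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull"
)
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"


def parse_ratelimit(value):
    """Parse a header value such as '100;w=21600' into (pulls, window_seconds)."""
    count, _, params = value.partition(";")
    window = None
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w" and val.isdigit():
            window = int(val)
    return int(count), window


def fetch_limits():
    """Return the parsed limit headers for the calling IP (needs network access)."""
    # Anonymous token, so the answer reflects the per-IP limit.
    with urllib.request.urlopen(TOKEN_URL) as resp:
        token = json.load(resp)["token"]
    # A HEAD request is used so that only headers are fetched.
    req = urllib.request.Request(
        MANIFEST_URL, method="HEAD", headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return {
            name: parse_ratelimit(resp.headers[f"ratelimit-{name}"])
            for name in ("limit", "remaining")
            if resp.headers.get(f"ratelimit-{name}")
        }
```

Calling `fetch_limits()` from a CI job before the first `docker pull` would show whether the shared runner IP is already close to the limit.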

There have been a few solutions proposed:

  1. Authenticate with Docker Hub. This changes the rate limit to 200 pulls / 6 hours / account. I do not know if that is sufficient for our needs across the org. I get the impression that infra team members do not like this option.
  2. Mirror in GitHub Container Registry (docs). From what I can tell, there doesn't seem to be a read limit. There is a 10 GB per-layer limit, which I think is fine. The main drawback is that it requires manually updating the images (like when new Ubuntu images are released).
    • I do not think we update the base images very often. I do not know if the infra team has a pre-existing mechanism for uploading, or how difficult that is to do manually.
  3. Use Amazon ECR. I think the ECR Public Gallery has many of the images we typically use. I am uncertain, but I think the unauthenticated rate limit is 500GB/month (per IP?) source. We could authenticate using OIDC, which raises the cap to 5 TB / month.
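
To illustrate what option 3 could look like in practice, here is a rough Python sketch that rewrites `FROM` lines in a Dockerfile to point at the ECR Public Gallery. It assumes the Docker official images are published under `public.ecr.aws/docker/library/`, and that the specific image and tag being rewritten actually exist there, which would need checking per image. The function names are hypothetical.

```python
ECR_PUBLIC_PREFIX = "public.ecr.aws/docker/library/"


def to_ecr_public(image):
    """Map a Docker Hub official image reference to an ECR Public reference.

    References with a namespace or another registry are returned unchanged,
    since there is no guarantee that a mirror exists for them.
    """
    name = image
    for prefix in ("docker.io/library/", "library/"):
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    if "/" in name:
        return image  # not an official image, leave as-is
    return ECR_PUBLIC_PREFIX + name


def rewrite_dockerfile(text):
    """Rewrite FROM lines, skipping `scratch`, build args, and earlier stages."""
    stages = set()
    out = []
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0].upper() == "FROM":
            # Find the image token, skipping flags such as --platform=...
            idx = next(
                (i for i in range(1, len(parts)) if not parts[i].startswith("--")),
                None,
            )
            if idx is not None:
                image = parts[idx]
                if image != "scratch" and image not in stages and "$" not in image:
                    parts[idx] = to_ecr_public(image)
                # Remember `AS <name>` so later `FROM <name>` lines are untouched.
                if len(parts) > idx + 2 and parts[idx + 1].upper() == "AS":
                    stages.add(parts[idx + 2])
                line = " ".join(parts)
        out.append(line)
    return "\n".join(out)
```

For example, `FROM ubuntu:22.04 AS base` would become `FROM public.ecr.aws/docker/library/ubuntu:22.04 AS base`, while a later `FROM base` is left alone.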

I do not know how performance and reliability compare between ghcr and Amazon ECR.

Would the infra team have a preference here? I prefer whatever is easiest 😜. ghcr seems appealing to me if the infra team is ok with handling uploading new images.

@Mark-Simulacrum
Member

> Authenticate with Docker Hub. This changes the rate limit to 200 pulls / 6 hours / account. I do not know if that is sufficient for our needs across the org. I get the impression that infra team members do not like this option.

Yeah, I'd prefer to avoid ~personal/team accounts on Docker Hub. It seems like unnecessary hassle, and 200 pulls / 6 hours also doesn't feel high enough to fully solve the problem.

> Use Amazon ECR. I think the ECR Public Gallery has many of the images we typically use. I am uncertain, but I think the unauthenticated rate limit is 500GB/month (per IP?) source. We could authenticate using OIDC, which raises the cap to 5 TB / month.

It is per IP: "Data transferred out from public repositories is limited by source IP when an AWS account is not used." (https://aws.amazon.com/ecr/pricing/)

5 TB isn't a cap, it's just the free tier. Past that we start paying, but I'd expect that in practice we wouldn't use much beyond 5 TB (if at all, that's a pretty large amount of data).
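
As a back-of-the-envelope check on how far those allowances go, a small Python sketch follows. The 500 MB image size is a made-up placeholder, the units are assumed to be decimal, and layer caching on the runner is ignored, so real transfer would likely be lower.

```python
MB = 10**6
GB = 10**9
TB = 10**12


def pulls_within(budget_bytes, image_bytes):
    """Number of full image pulls that fit into a monthly transfer budget."""
    return budget_bytes // image_bytes


# Hypothetical compressed size of one CI base image.
image_size = 500 * MB

anonymous_pulls = pulls_within(500 * GB, image_size)     # per source IP
authenticated_pulls = pulls_within(5 * TB, image_size)   # free tier with an AWS account
```

Under those assumptions the anonymous allowance covers about 1,000 full pulls per month per IP and the authenticated free tier about 10,000.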

For ECR, if it's not in the existing public gallery, we could probably configure pull through caching (https://docs.aws.amazon.com/AmazonECR/latest/userguide/pull-through-cache.html), though it sounds like that would require authentication. I'm much more comfortable with that than with ending up needing multiple Docker Hub accounts to distribute load (as seems likely if e.g. rust-lang/rust uses this).
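
A sketch of how image references would change with a pull through cache rule, in Python. It assumes a rule with the repository prefix `docker-hub` (the prefix is configurable), and follows the AWS documentation's convention that Docker Hub official images live under `library/`. The account ID and region in the usage note are placeholders, and the function name is hypothetical.

```python
def pull_through_uri(image, account_id, region, prefix="docker-hub"):
    """Build the private ECR URI that a pull through cache rule would serve."""
    name = image
    if name.startswith("docker.io/"):
        name = name[len("docker.io/"):]
    # Official images have no namespace on Docker Hub; ECR expects `library/`.
    if "/" not in name:
        name = "library/" + name
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{prefix}/{name}"
```

With the placeholder account `123456789012` in `us-east-1`, `ubuntu:22.04` maps to `123456789012.dkr.ecr.us-east-1.amazonaws.com/docker-hub/library/ubuntu:22.04`. Pulling from that URI requires authenticating to the private registry first.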

> I do not think we update the base images very often. I do not know if the infra team has a pre-existing mechanism for uploading, or how difficult that is to do manually.

I think base images get updated pretty regularly? At least I'd expect that e.g. ubuntu:22.04 is getting updates constantly -- it was updated just 8 days ago (way after initial release) https://hub.docker.com/layers/library/ubuntu/22.04/images/sha256-3d1556a8a18cf5307b121e0a98e93f1ddf1f3f8e092f1fddfd941254785b95d7?context=explore

MarcoIeni self-assigned this Dec 10, 2024
@MarcoIeni
Member

This is on my todo list now 👍
I will switch from Docker Hub to AWS ECR, as agreed here.
