Rate exceeded for aws ecr image #1400

Closed
Jeffwan opened this issue Sep 6, 2021 · 5 comments

Comments

@Jeffwan
Member

Jeffwan commented Sep 6, 2021

It seems public ECR has rate limits as well. My environment fails to pull the training-operator image with the error "Error response from daemon: toomanyrequests: Rate exceeded". It would be great to understand the limit and have a proper configuration for it.

Events:
  Type     Reason          Age               From               Message
  ----     ------          ----              ----               -------
  Normal   Scheduled       19s               default-scheduler  Successfully assigned kubeflow/training-operator-5f44787bdf-b2cpk to docker-desktop
  Warning  Failed          15s               kubelet            Failed to pull image "public.ecr.aws/j1r0q0g6/training/training-operator:5ef6c405df2bb1bf1d3ede988cd43433eff2e956": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: Rate exceeded
  Warning  Failed          15s               kubelet            Error: ErrImagePull
  Normal   SandboxChanged  6s (x3 over 15s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   BackOff         6s (x2 over 11s)  kubelet            Back-off pulling image "public.ecr.aws/j1r0q0g6/training/training-operator:5ef6c405df2bb1bf1d3ede988cd43433eff2e956"
  Warning  Failed          6s (x2 over 11s)  kubelet            Error: ImagePullBackOff
  Normal   Pulling         2s (x2 over 16s)  kubelet            Pulling image "public.ecr.aws/j1r0q0g6/training/training-operator:5ef6c405df2bb1bf1d3ede988cd43433eff2e956"

I can verify the image exists.

➜  mnist_with_summaries git:(sdk) ✗ docker pull public.ecr.aws/j1r0q0g6/training/training-operator:5ef6c405df2bb1bf1d3ede988cd43433eff2e956

5ef6c405df2bb1bf1d3ede988cd43433eff2e956: Pulling from j1r0q0g6/training/training-operator
Digest: sha256:a9a0a1f6b8a399acc5bee43394e252e1a7c8a0d23869e403fc4b834ba268a027
Status: Image is up to date for public.ecr.aws/j1r0q0g6/training/training-operator:5ef6c405df2bb1bf1d3ede988cd43433eff2e956
public.ecr.aws/j1r0q0g6/training/training-operator:5ef6c405df2bb1bf1d3ede988cd43433eff2e956
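
For anyone who pulls the image manually or in a script and hits the same error, a minimal sketch of a retry wrapper follows. It assumes the `docker` CLI is on the PATH and that the rate-limit message ("toomanyrequests") shows up on stderr, as in the events above. The function name, attempt count, and delays are illustrative choices, not anything ECR or the training operator documents.

```python
import subprocess
import time


def pull_with_backoff(image, max_attempts=5, base_delay=2.0,
                      run=subprocess.run, sleep=time.sleep):
    """Pull `image` with `docker pull`, retrying only on rate-limit errors.

    Returns the number of attempts used. `run` and `sleep` are injectable
    so the retry logic can be exercised without Docker installed.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(["docker", "pull", image], capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        # Retry only on the registry's rate-limit error; fail fast otherwise.
        if "toomanyrequests" not in result.stderr:
            raise RuntimeError(f"docker pull failed: {result.stderr.strip()}")
        if attempt < max_attempts:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

This only mirrors what the kubelet already does with its own back-off, so inside the cluster the pod should recover without it once the limit resets.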
@Jeffwan
Member Author

Jeffwan commented Sep 6, 2021

This only delays the operator coming up; there are no other issues.

@Jeffwan
Member Author

Jeffwan commented Sep 7, 2021

/area enhancement

@google-oss-robot

@Jeffwan: The label(s) area/enhancement cannot be applied, because the repository doesn't have them.

In response to this:

/area enhancement

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Jeffwan
Member Author

Jeffwan commented Sep 7, 2021

/kind enhancement

@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the lifecycle/stale label Mar 2, 2022
stale bot closed this as completed Apr 16, 2022