Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cip: fix fatal bug in SplitByKnownRegistries #188

Closed
listx opened this issue Mar 5, 2020 · 1 comment · Fixed by #189
Closed

cip: fix fatal bug in SplitByKnownRegistries #188

listx opened this issue Mar 5, 2020 · 1 comment · Fixed by #189
Assignees

Comments

@listx
Copy link
Contributor

listx commented Mar 5, 2020

@thockin TL;DR: Allowing images to be pushed to the root directory currently breaks the promoter. Fix on the way...!

Pasted error log from ci-k8sio-cip run from earlier today:

I0304 21:47:40.563384      13 inventory.go:1615] Request {gcr.io/k8s-staging-cluster-api/kubeadm-bootstrap-controller   true}: OK
I0304 21:47:40.566394      13 inventory.go:1615] Request {us.gcr.io/k8s-artifacts-prod/resource_consumer/controller k8s-infra-gcr-promoter@k8s-artifacts-prod.iam.gserviceaccount.com  false}: OK
I0304 21:47:40.605095      13 inventory.go:1615] Request {gcr.io/google-containers/skydns-ppc64le   true}: OK
I0304 21:47:40.629320      13 inventory.go:1615] Request {us.gcr.io/k8s-artifacts-prod/federation-controller-manager-amd64 k8s-infra-gcr-promoter@k8s-artifacts-prod.iam.gserviceaccount.com  false}: OK
panic: runtime error: index out of range [0] with length 0
goroutine 162 [running]:
sigs.k8s.io/k8s-container-image-promoter/lib/dockerregistry.SplitByKnownRegistries(0xc006ecf590, 0x2f, 0xc000d74000, 0x6c, 0x92, 0x0, 0x0, 0x1, 0xd, 0x0, ...)
	lib/dockerregistry/inventory.go:1491 +0x326
sigs.k8s.io/k8s-container-image-promoter/lib/dockerregistry.(*SyncContext).ReadRegistries.func2(0xc0002ca000, 0xc0000c8240, 0xc00521b4a0, 0xc00113c800, 0xc00113c730)
	lib/dockerregistry/inventory.go:1282 +0xafa
created by sigs.k8s.io/k8s-container-image-promoter/lib/dockerregistry.(*SyncContext).ExecRequests
	lib/dockerregistry/inventory.go:1620 +0x137

(see https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-k8sio-cip/1235320791898263552)

This happens inside the SplitByKnownRegistries(unparsed_name, prefixes) function, iff the registry name and the full image name path (to be split into the registry + image name) are the same.

I've reproduced this locally, and understand why this is happening. Previously (before the backfill manifests were merged, this has never happened because we've never had to read images that have an unparsed_name that matches a prefix.

Let's take an example. Currently the us.gcr.io/k8s-artifacts-prod/kube-state-metrics repo has some images in it as a result of the backfill. However, there is also a promoter manifest that names us.gcr.io/k8s-artifacts-prod/kube-state-metrics as a destination repository. Normally, new images under this repository would end up as us.gcr.io/k8s-artifacts-prod/kube-state-metrics/<NEW_IMAGE>. And so normally, this would end up calling SplitByKnownRegistries with us.gcr.io/k8s-artifacts-prod/kube-state-metrics/<NEW_IMAGE> as the unparsed_name and us.gcr.io/k8s-artifacts-prod/kube-state-metrics as one of the prefixes to parse with.

However, ever since the partial backfilling that happened yesterday, there are now images that rest in this same path --- as of the time of this writing, for example, this image:

us.gcr.io/k8s-artifacts-prod/kube-state-metrics:v1.4.0

And so when we read the path us.gcr.io/k8s-artifacts-prod/kube-state-metrics today, we end up calling SplitByKnownRegistries with this name that matches one of the prefixes with the same name, us.gcr.io/k8s-artifacts-prod/kube-state-metrics. When kubernetes/k8s.io#623 was initially merged, we did not see the same error because no root-level images had been populated yet. We saw the error today in https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-k8sio-cip/1235320791898263552 for the first time because some images like us.gcr.io/k8s-artifacts-prod/kube-state-metrics:v1.4.0 were created due to the partial backfill.

The fix should be simple --- if the image name matches a known prefix, just take everything after the last '/' character.

Related context:

@listx
Copy link
Contributor Author

listx commented Mar 5, 2020

I noticed that the SplitByKnownRegistries function does not have a corresponding unit test. Fixing that now.

listx pushed a commit to listx/k8s-container-image-promoter that referenced this issue Mar 5, 2020
listx pushed a commit to listx/k8s-container-image-promoter that referenced this issue Mar 5, 2020
listx pushed a commit to listx/test-infra that referenced this issue Mar 5, 2020
Most notably, this brings in a promoter "cip" version that shouldn't
choke on images at the top level of repositories:
kubernetes-sigs/promo-tools#188.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant