Migrate jobs away from gs://kubernetes-release-dev #846

Closed
spiffxp opened this issue May 6, 2020 · 23 comments
Labels: area/prow, area/release-eng, priority/important-longterm, sig/release, sig/testing

Comments

@spiffxp
Member

spiffxp commented May 6, 2020

There are a number of jobs that assume they have write access to gs://kubernetes-release-dev, and then a bunch of other jobs that assume they should be downloading builds from gs://kubernetes-release-dev

Unfortunately, this bucket will not allow non-google.com service accounts to write to it. We need to use a new bucket, and develop a plan for using it that doesn't require cutting over all jobs in lockstep.

I am pretty sure this overlaps with ongoing work by the release-engineering subproject

ref: #752, #841 - followup to #830

@spiffxp spiffxp changed the title Migrate away from gs://kubernetes-release-dev, gcr.io/kubernetes-ci-images Migrate away from gs://kubernetes-release-dev May 6, 2020
@spiffxp spiffxp changed the title Migrate away from gs://kubernetes-release-dev Migrate jobs away from gs://kubernetes-release-dev May 7, 2020
@spiffxp spiffxp added area/prow, sig/scalability, sig/release, sig/testing, sig/network, wg/k8s-infra, area/release-eng and removed sig/network, sig/scalability labels May 7, 2020
@spiffxp
Member Author

spiffxp commented May 7, 2020

https://cs.k8s.io/?q=latest-green.txt&i=nope&files=&repos=

It looks like gs://kubernetes-release-dev/ci/latest-green.txt isn't actually consumed by automation

@justaugustus
Member

@spiffxp -- Carrying the convo from https://kubernetes.slack.com/archives/CJH2GBF7Y/p1590088139329000...

Yep, we've identified this as a need, but slowed down because the Prow bits weren't in place yet (ref: https://kubernetes.slack.com/archives/CJH2GBF7Y/p1590088139329000).

So, broad strokes/first guess on a plan:

  • We need a new GCP project (shared ownership between SIG Release (Release Engineering), SIG Testing, WG K8s Infra). I don't know who "owns" kubernetes-release-dev today.
  • New GCS bucket
  • insert infra script updates/actuation here
  • Add a new scenario that defaults to publish to the new bucket
  • Create canary ci-kubernetes-build jobs for all branches to start seeding the new bucket
  • Start cutting over some e2e jobs to pull from the canary buckets (maybe some of the release-* e2es?)
  • Migrate the other e2es
  • Flip the ci-kubernetes-build jobs over to point to the new scenario (and turn down the canary build jobs)
  • Flip release tooling to use the new bucket

Again, that's an incredibly rough draft, but let me know what you think.

@justaugustus
Member

(I should assign myself here as well)
/assign

@justaugustus
Member

FYI: @kubernetes/release-engineering @kubernetes/ci-signal

@spiffxp
Member Author

spiffxp commented Aug 4, 2020

Are there any strong opinions about:

  • which GCP project the new bucket(s) should live under?
  • what the new bucket(s) should be called?

Existing buckets I want to consider in this context are:

  • gs://kubernetes-release-dev - used by periodics to store release artifacts (google-containers project, only google.com allowed)
  • gs://kubernetes-release-pull - used by presubmits to stage release artifacts for launching e2e tests (kubernetes-jenkins project, allows non-google.com, but eventually we'll need to move away from this too)

@tpepper
Member

tpepper commented Aug 4, 2020

I somewhat wish the -pull, -dev buckets mapped directly to the ci-, periodic-, pull- test name pattern, but that's probably non-trivial to resolve?

@spiffxp
Member Author

spiffxp commented Aug 4, 2020

Opened #1110 with my proposal

@tpepper
Member

tpepper commented Aug 5, 2020

Relative to my comment earlier in the issue and pulling discussion from the PR closer to this comment:

The "stops at a project+SKU" is why I'm proposing something (k8s-release) instead of k8s-artifacts-prod. The dividing line in my head is k8s-artifacts-prod is for final/static release artifacts for every single kubernetes subproject; k8s-release is for all the intermediary stuff (per-pr, postsubmits, nightlies, etc) related to kubernetes

...makes sense to me and explains the "-dev" as synonym/summary name for the ci-, periodic-, pull- tests' artifacts. Dev as meaning not prod is logical.

@spiffxp
Member Author

spiffxp commented Jan 22, 2021

kubernetes/test-infra#19483 (comment) - points out there are many references to kubernetes-release-dev across the project

I'd like to scope this issue to just the job configs in kubernetes/test-infra. We can consider this closable once all job configs have migrated to reference k8s-release-dev (except those necessary to maintain kubernetes-release-dev during deprecation window).
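One way to track that scope is to list which job config files still mention the old bucket. The following is only a minimal sketch, not tooling from this issue: the function name, the in-memory `configs` mapping, and the sample file contents are all hypothetical. It matches `kubernetes-release-dev` as a whole name so that `k8s-release-dev` is not counted.

```python
import re

# Match the legacy bucket name only when it stands alone, so that
# "k8s-release-dev" or "kubernetes-release-dev-foo" do not count.
LEGACY_BUCKET = re.compile(r"(?<![\w-])kubernetes-release-dev(?![\w-])")


def legacy_references(configs):
    """Return {filename: [1-based line numbers]} for lines naming the legacy bucket.

    `configs` maps a file name to its text; reading files from disk is left out
    to keep the sketch self-contained.
    """
    hits = {}
    for name, text in configs.items():
        lines = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if LEGACY_BUCKET.search(line)
        ]
        if lines:
            hits[name] = lines
    return hits


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = {
        "old-job.yaml": "name: old\nbucket: gs://kubernetes-release-dev/ci\n",
        "new-job.yaml": "name: new\nbucket: gs://k8s-release-dev/ci\n",
    }
    print(legacy_references(sample))
```

An empty result, apart from the jobs that maintain kubernetes-release-dev during the deprecation window, would correspond to the "closable" condition described above.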

@spiffxp
Member Author

spiffxp commented Jan 22, 2021

/milestone v1.21

@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Jan 22, 2021
@spiffxp
Member Author

spiffxp commented Feb 8, 2021

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm label Feb 8, 2021
@spiffxp
Member Author

spiffxp commented Feb 17, 2021

Opened kubernetes/test-infra#20885

  • adds tests to log whether job configs that use --extract= are pulling from community-owned buckets
  • updated release-blocking tests to use gs://k8s-release, fail on any future release-blocking tests that don't
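The two bullets above describe a "log for most jobs, fail for release-blocking jobs" policy. The real tests live in kubernetes/test-infra#20885; what follows is only a rough sketch of that policy in Python. It assumes that a job names its bucket through an `--extract-ci-bucket=` argument, that jobs are plain dicts with `name`, `args`, and `release_blocking` keys, and that `k8s-release-dev` is the only community-owned bucket. All three are assumptions made for illustration, not details taken from this thread.

```python
# Assumption: the set of community-owned buckets; only this name appears in the thread.
COMMUNITY_BUCKETS = {"k8s-release-dev"}

# Assumption: the argument a job uses to select the bucket it extracts from.
BUCKET_FLAG = "--extract-ci-bucket="


def extract_bucket(args):
    """Return the bucket named by BUCKET_FLAG, or None if the job does not set it."""
    for arg in args:
        if arg.startswith(BUCKET_FLAG):
            return arg[len(BUCKET_FLAG):]
    return None


def check_jobs(jobs):
    """Split jobs that use --extract= without a community bucket into two lists.

    Returns (warnings, failures): names of ordinary jobs to log, and names of
    release-blocking jobs that should fail the check.
    """
    warnings, failures = [], []
    for job in jobs:
        args = job["args"]
        if not any(arg.startswith("--extract=") for arg in args):
            continue  # job does not extract a build, nothing to check
        if extract_bucket(args) in COMMUNITY_BUCKETS:
            continue  # already pulling from a community-owned bucket
        target = failures if job.get("release_blocking") else warnings
        target.append(job["name"])
    return warnings, failures
```

Keeping the warning and failure lists separate lets a test log the first and assert that the second is empty, which matches the split between logging and failing described in the bullets.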

@spiffxp
Member Author

spiffxp commented Mar 17, 2021

kubernetes/test-infra#20964 (comment)

Analysis from a while ago:

  • as long as
    • dl.k8s.io/ci continues to point to kubernetes-release-dev
    • there are parallel jobs pushing to kubernetes-release-dev, k8s-release-dev respectively
    • the build, or either job, occasionally flakes
  • people using dl.k8s.io/ci/*.txt version markers may try pulling something that doesn't exist in k8s-release-dev
  • (and vice-versa, if you use k8s-release-dev/ci/*.txt but pull from kubernetes-release-dev)

We should either sync (one-way), move dl.k8s.io and other things aggressively (post-v1.21), or make it really really clear when and why this happens
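To make the failure mode concrete: a version marker read from one bucket can name a build that the other bucket never received. A minimal sketch of that check, with hypothetical marker names and version strings, and with bucket contents passed in as plain Python values rather than fetched from GCS:

```python
def dangling_markers(markers, builds):
    """Return the markers whose version is missing from `builds`.

    markers: {marker file name: version string}, as read from one bucket
    builds:  set of version strings actually present in the other bucket
    """
    return {name: version for name, version in markers.items() if version not in builds}


if __name__ == "__main__":
    # Hypothetical data: the marker comes from kubernetes-release-dev (via
    # dl.k8s.io/ci), while the builds listed are those in k8s-release-dev.
    markers = {"latest.txt": "v1.22.0-alpha.0.20+bbbbbbb"}
    builds_in_new_bucket = {"v1.22.0-alpha.0.10+aaaaaaa"}
    # The build job pushing to the new bucket flaked, so the newest build is absent.
    print(dangling_markers(markers, builds_in_new_bucket))
```

A one-way sync would aim to keep this result empty; the alternative is to make it obvious to users when and why it is not.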

@spiffxp
Member Author

spiffxp commented Apr 15, 2021

/milestone v1.22

@spiffxp
Member Author

spiffxp commented Jul 2, 2021

Opened #2291 with a script I've been using to keep an eye on the delta between gs://k8s-release-dev and gs://kubernetes-release-dev
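The script itself is in #2291 and is not reproduced here. As a rough illustration of what "the delta" between the two buckets could mean, this sketch compares version markers from both buckets and reports the ones that disagree. The function name and the sample data are hypothetical, and the marker contents are passed in as dicts instead of being read from GCS.

```python
def marker_delta(legacy, community):
    """Return {marker: (legacy_version, community_version)} for markers that differ.

    A marker missing from one side is reported with None for that side.
    """
    delta = {}
    for name in sorted(set(legacy) | set(community)):
        old, new = legacy.get(name), community.get(name)
        if old != new:
            delta[name] = (old, new)
    return delta


if __name__ == "__main__":
    # Hypothetical marker contents for gs://kubernetes-release-dev and gs://k8s-release-dev.
    legacy = {"ci/latest.txt": "v1.22.0-alpha.0.20+bbbbbbb", "ci/latest-1.21.txt": "v1.21.0+ccccccc"}
    community = {"ci/latest.txt": "v1.22.0-alpha.0.10+aaaaaaa", "ci/latest-1.21.txt": "v1.21.0+ccccccc"}
    for name, (old, new) in marker_delta(legacy, community).items():
        print(f"{name}: kubernetes-release-dev={old} k8s-release-dev={new}")
```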

@spiffxp
Member Author

spiffxp commented Jul 9, 2021

I'm going to consider this closed if kubernetes/test-infra#22840 lands without breaking anything

I've opened #2318 as the umbrella issue to track "what else besides jobs"

@spiffxp
Member Author

spiffxp commented Jul 17, 2021

/close
We caused some breakage but at this point I think anything remaining that nobody's noticed yet can be tracked in the umbrella issue.

I have a timeline of what landed / broke / fixed when: #2318 (comment)

@k8s-ci-robot
Contributor

@spiffxp: Closing this issue.



5 participants