Kops - Don't publish the latest-ci marker from periodics #33718

Merged: 3 commits, Oct 24, 2024

@rifelpet rifelpet commented Oct 23, 2024

Context

The sequence of Kops version markers is:

  1. The kops-postsubmit-push-to-staging postsubmit job writes to latest-ci.txt (make target, example job log)

       Step #1 - "artifacts": gsutil -h "Cache-Control:private, max-age=0, no-transform" cp /workspace/.build/upload/latest.txt gs://k8s-staging-kops/kops/releases/markers/master/latest-ci.txt

  2. The e2e-kops-pipeline-updown-kopsmaster periodic job reads from latest-ci.txt and if creating and tearing down a cluster is successful, writes the same marker to latest-ci-updown-green.txt

       --kops-version-marker=https://storage.googleapis.com/k8s-staging-kops/kops/releases/markers/master/latest-ci.txt \
       --publish-version-marker=gs://k8s-staging-kops/kops/releases/markers/master/latest-ci-updown-green.txt \

  3. Almost all other periodic jobs then read from latest-ci-updown-green.txt:

       if kops_version is None:
           kops_deploy_url = marker_updown_green(None)
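
The three steps above can be pictured with a small in-memory model. This is only an illustrative sketch: the `MarkerStore` class and the `postsubmit_push`, `updown_job`, and `read_for_e2e` functions are hypothetical names invented here, not part of kops or test-infra, and a dict stands in for the GCS markers directory.

```python
from typing import Callable, Dict, Optional

LATEST_CI = "latest-ci.txt"
UPDOWN_GREEN = "latest-ci-updown-green.txt"


class MarkerStore:
    """Hypothetical stand-in for the markers directory in the bucket."""

    def __init__(self) -> None:
        self._markers: Dict[str, str] = {}

    def write(self, name: str, version: str) -> None:
        self._markers[name] = version

    def read(self, name: str) -> Optional[str]:
        return self._markers.get(name)


def postsubmit_push(store: MarkerStore, version: str) -> None:
    # Step 1: the postsubmit publishes the freshly built version.
    store.write(LATEST_CI, version)


def updown_job(store: MarkerStore, cluster_ok: Callable[[str], bool]) -> None:
    # Step 2: promote latest-ci to updown-green only if up/down succeeds.
    version = store.read(LATEST_CI)
    if version is not None and cluster_ok(version):
        store.write(UPDOWN_GREEN, version)


def read_for_e2e(store: MarkerStore) -> Optional[str]:
    # Step 3: other periodic jobs consume the promoted marker.
    return store.read(UPDOWN_GREEN)


store = MarkerStore()
postsubmit_push(store, "build-a")
updown_job(store, cluster_ok=lambda v: True)   # build-a is promoted
postsubmit_push(store, "build-b")
updown_job(store, cluster_ok=lambda v: False)  # build-b fails up/down
print(read_for_e2e(store))  # build-a
```

In this model a broken build never reaches the other periodic jobs, which is the point of having a separate updown-green marker.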


Problem

The e2e-kops-aws-k8s-latest periodic job has been reading from latest-ci-updown-green.txt and writing to latest-ci.txt:

--kops-version-marker=https://storage.googleapis.com/k8s-staging-kops/kops/releases/markers/master/latest-ci-updown-green.txt \
--publish-version-marker=gs://k8s-staging-kops/kops/releases/markers/master/latest-ci.txt \
--kubernetes-version=https://storage.googleapis.com/k8s-release-dev/ci/latest.txt \

This creates a race condition for the updown-kopsmaster periodic job: the latest-ci.txt it reads may have been written by either the postsubmit-push-to-staging job or the kops-aws-k8s-latest periodic job. It also means that all of our other e2e jobs can be using outdated artifacts from kops' master branch until a kops PR merge wins the race.
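
One interleaving that produces the stale result can be replayed deterministically. The sketch below is an assumption-laden model, not the real jobs: `simulate` and the `build-1`/`build-2` version strings are made up for illustration, and the ordering of events is just one possible schedule.

```python
from typing import Dict, Optional

LATEST_CI = "latest-ci.txt"
UPDOWN_GREEN = "latest-ci-updown-green.txt"


def simulate(aws_latest_publishes: bool) -> Dict[str, Optional[str]]:
    """Replay one possible schedule of the jobs (hypothetical model)."""
    markers: Dict[str, Optional[str]] = {LATEST_CI: None, UPDOWN_GREEN: None}

    # 1. A kops PR merges; the postsubmit publishes build-1.
    markers[LATEST_CI] = "build-1"
    # 2. The updown job passes and promotes build-1.
    markers[UPDOWN_GREEN] = markers[LATEST_CI]
    # 3. The aws-k8s-latest periodic starts and reads the green marker.
    held = markers[UPDOWN_GREEN]
    # 4. Another kops PR merges; the postsubmit publishes build-2.
    markers[LATEST_CI] = "build-2"
    # 5. The aws-k8s-latest periodic finishes and publishes what it read.
    if aws_latest_publishes:
        markers[LATEST_CI] = held  # overwrites build-2 with build-1
    # 6. The updown job runs again and promotes whatever latest-ci holds.
    markers[UPDOWN_GREEN] = markers[LATEST_CI]
    return markers


print(simulate(aws_latest_publishes=True)[UPDOWN_GREEN])   # build-1
print(simulate(aws_latest_publishes=False)[UPDOWN_GREEN])  # build-2
```

With the periodic publishing, build-2 is lost until a later merge rewrites latest-ci.txt; without it, the newer build is promoted as expected.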


Solution

This PR updates the e2e-kops-aws-k8s-latest job to stop writing to the latest-ci.txt version marker, breaking the race condition. The "This version marker is only used by the k/k presubmit job" comment is not true, because the marker is also used by the pipeline job that writes to latest-ci-updown-green.txt.
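
A check along these lines could guard against the problem coming back. It is a minimal sketch under stated assumptions: the job dictionaries (`name`, `type`, `args`) are a hypothetical simplified shape, not the real Prow job schema, and `periodics_publishing_latest_ci` is an invented helper. Only the `--publish-version-marker` flag and the marker URLs come from the job definitions quoted above. Modelling the fix as dropping the flag is also an assumption.

```python
from typing import Dict, List

MARKER_DIR = "gs://k8s-staging-kops/kops/releases/markers/master"
LATEST_CI_MARKER = MARKER_DIR + "/latest-ci.txt"
PUBLISH_FLAG = "--publish-version-marker="


def periodics_publishing_latest_ci(jobs: List[Dict]) -> List[str]:
    """Return names of periodic jobs that publish to latest-ci.txt."""
    offenders = []
    for job in jobs:
        if job["type"] != "periodic":
            continue
        for arg in job["args"]:
            # Compare the flag value exactly so the updown-green marker,
            # which shares a prefix, is not flagged.
            if arg.startswith(PUBLISH_FLAG) and arg[len(PUBLISH_FLAG):] == LATEST_CI_MARKER:
                offenders.append(job["name"])
    return offenders


before = [
    {
        "name": "e2e-kops-pipeline-updown-kopsmaster",
        "type": "periodic",
        "args": [PUBLISH_FLAG + MARKER_DIR + "/latest-ci-updown-green.txt"],
    },
    {
        "name": "e2e-kops-aws-k8s-latest",
        "type": "periodic",
        "args": [PUBLISH_FLAG + LATEST_CI_MARKER],
    },
]
# Assumed shape of the job after the change: no publish flag at all.
after = [before[0], {"name": "e2e-kops-aws-k8s-latest", "type": "periodic", "args": []}]

print(periodics_publishing_latest_ci(before))  # ['e2e-kops-aws-k8s-latest']
print(periodics_publishing_latest_ci(after))   # []
```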


I'm also adding to build_jobs.py a storage class setting that was added directly to the YAML in #33704.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/config Issues or PRs related to code in /config size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/jobs labels Oct 23, 2024
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Oct 23, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 24, 2024
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman, rifelpet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit b2e7d7b into kubernetes:master Oct 24, 2024
7 checks passed
@k8s-ci-robot

@rifelpet: Updated the job-config configmap in namespace default at cluster test-infra-trusted using the following files:

  • key kops-periodics-distros.yaml using file config/jobs/kubernetes/kops/kops-periodics-distros.yaml
  • key kops-periodics-grid.yaml using file config/jobs/kubernetes/kops/kops-periodics-grid.yaml
  • key kops-periodics-misc2.yaml using file config/jobs/kubernetes/kops/kops-periodics-misc2.yaml
  • key kops-periodics-versions.yaml using file config/jobs/kubernetes/kops/kops-periodics-versions.yaml
  • key kops-presubmits-distros.yaml using file config/jobs/kubernetes/kops/kops-presubmits-distros.yaml

