Add Roadmap and Vision #1529

Merged: 6 commits, Aug 5, 2021

@saschagrunert saschagrunert commented Apr 20, 2021

What type of PR is this:

/kind documentation

What this PR does / why we need it:

This PR adds the SIG Release Roadmap and Vision for 2021 and beyond.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

/priority important-soon
/cc @kubernetes/sig-release-leads
/hold for discussion and review

Referring mailing list discussion: https://groups.google.com/g/kubernetes-sig-release/c/YTessa2-Kow/m/bFy_9bEMAQAJ
Should merge after the cadence KEP: kubernetes/enhancements#2567

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold, kind/documentation, priority/important-soon, cncf-cla: yes, sig/release, approved, and size/L labels Apr 20, 2021
Comment on lines 129 to 131
1. **SIG Cluster Lifecycle**

To get input for making Cluster API a first-class signal for upstream releases.
Member

in addition to or replacing current jobs?

Member Author

@saschagrunert saschagrunert Apr 20, 2021

In addition. Maybe @kubernetes/sig-cluster-lifecycle can provide additional feedback on top of that point.

Member

this was a SIG CL proposal:
kubernetes/kubernetes#82532
"replace the existing PR and release blocking kube-up based jobs with CAPI jobs"

IIRC, SIG Testing and Release had some agreement on this, but i don't think it was discussed with other SIGs.

Member

@xmudrii xmudrii left a comment

I left two comments, but other than that, this looks great! 💯

Comment on lines 91 to 95
1. **Moving deb/rpm package builds to community infrastructure (large)**
   - _What:_ The deb/rpm packages are built by the Google build admin, we should
     move that to community infrastructure and build them automatically.
   - _Why:_ It’s easier for us to fix issues independently that way and we have
     more control about how the packages are being built.
Member

Why is this out of scope? In Primary focus, we say that we want to be independent from external companies like Google. IMO, this is probably the biggest dependency and it brings many problems that would be easier to solve if we had our own deb/rpm infra.


1. **SIG Cluster Lifecycle**

To get input for making Cluster API a first-class signal for upstream releases.
Member

I think it would be useful to clarify what this would improve and what we gain from it.

Outcome: A documented and simple process for handling CVE information within
Kubernetes releases.

1. **Establish Cluster API as first-class signal for upstream releases
Member

@aojea aojea Apr 20, 2021

I checked this job https://testgrid.k8s.io/sig-cluster-lifecycle-cluster-api-provider-gcp#capg-conformance-v1alpha4-k8s-master&width=5 and there are several things to take into consideration:

  1. it runs a script https://github.com/kubernetes-sigs/cluster-api-provider-gcp/blob/master/scripts/ci-conformance.sh
    that installs KIND
  2. it calls another script that creates some GCP infra with bash "gcloud ..." commands and installs packer https://github.com/kubernetes-sigs/cluster-api-provider-gcp/blob/master/scripts/ci-e2e.sh
  3. packer runs some ansible based on https://github.com/kubernetes-sigs/image-builder/tree/master/images/capi
  4. then it runs kind and does its stuff

It installs calico from master?

Deploy calico
kubectl --kubeconfig="/tmp/kubeconfig" apply -f https://docs.projectcalico.org/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created

I see many moving parts; in my experience it will take a lot to stabilise this job ... I only ask that we not replace the current jobs until this one has demonstrated stability for at least one release cycle

Member

No objections to improving support for this mechanism. I agree with @aojea that the current mechanisms should remain until the new approach demonstrates stability for at least a release cycle. As an aside, I would have guessed this would have fallen under "Consumable", not sure how it relates to "Secure".

Member

as discussed recently with SIG Scale, the GCP provider for CAPI (CAPG) is not in a good state because it has no active maintainers who can dedicate a certain % of their workday to the project.
the existing CI scripts and setup for CAPG are currently best effort.

north-star-vision.md (outdated)
We're not completely aware of all technical aspects for the changes. This
means that there is a risk of delaying because of investing more time in
pre-research.

Contributor

Could decision-making on some of these topics be a problem? Do we expect a strong veto, or a not-yet-identified stakeholder opposing some of the work?

@lasomethingsomething
Contributor

Some of the comments posted so far suggest the need for greater clarification about the prioritisation and the "why" driving these items: Why are the outcomes listed here the most important, highest-impact ones? Why are some items lower down the list than others?

One (but not the only! :) ) way to tackle this would be to run an action priority matrix exercise during an upcoming SIG Release meeting. Then 1-2 sentences clarifying the impact/why could be arrived at by the group and included in this statement for others.


1. **Enhance Kubernetes binary artifact management (Consumable)**

Outcome: Being able to promote files as artifacts and using this mechanism
Member

is there a linked issue or more details for what this means?

Member Author

I think this one is the epic: #1372

cc @justaugustus @LappleApple if you have others in mind.

consumption easier. This includes being process independent from external
companies like Google.
1. **Introspectable**: It is clear for users at which point and how Kubernetes
artifacts are being built. This includes the documentation of all
Member

Are reproducible build artifacts included as part of this? I don't see them mentioned below. That has been a user-facing issue in the past (e.g. not being able to restart/recover from an issue during the release process and having to cut additional tags)

Member Author

Right now it's not part of the roadmap. I see that kubernetes/kubernetes#70131 has been closed, but there is an open question at the end of the conversation. Maybe we want to add this topic as well. Thoughts on that @kubernetes/release-team-leads ?

1. **Consumable**: Improving the usability of artifacts by making their
consumption easier. This includes being process independent from external
companies like Google.
1. **Introspectable**: It is clear for users at which point and how Kubernetes
Member

Is the "north star" just "documentation", or should it be a hermetic build process that is impervious to human interference, for all artifacts?

Member Author

I think it's both: for example, if we speak about package builds, then we want to completely automate that topic away from human interaction. On the other hand, if we look at the CVE process, then this will most of the time require human interaction.

Member

It's not clear to me that, even for a CVE, a human needs to touch the built artifacts, but I guess that path is open to discussion.

What I'd like to see as a strong "north star" statement is something like what I wrote - a hermetic build process that is impervious (or at least detectable) to human interference, for all official release artifacts including the official builds but also all of the container images.

Member Author

Sounds good. I added the following statement:

All official release artifacts will be built by a hermetic process that is impervious to human interference.
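
Editorial illustration, not part of the thread: one common way to make interference "detectable", as requested above, is to build an artifact twice independently and compare digests. The following is a minimal Python sketch under that assumption. The function names `sha256_of` and `builds_match` are hypothetical and are not SIG Release tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large artifacts do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when two independently built artifacts are byte-identical."""
    return sha256_of(first) == sha256_of(second)
```

A mismatch only shows that the two outputs differ. It does not say whether the cause is tampering or a non-deterministic build step, so this check is meaningful only once the build itself is reproducible.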

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert
Member Author

> One (but not the only! :) ) way to tackle this would be to run an action priority matrix exercise during an upcoming SIG Release meeting. Then 1-2 sentences clarifying the impact/why could be arrived at by the group and included in this statement for others.

I moved every topic to be in-scope for now. I think we should prioritize them within the SIG Release meeting. Unfortunately the next meeting will be May 18, because the week of May 3rd is KubeCon.

north-star-vision.md (outdated)

Outcome: Cluster API provides a CI signal for blocking release test jobs.

1. **Enhance and simplify Kubernetes version markers (Consumable)**
Member

Do we already have an issue for this item?

Member Author

Maybe @LappleApple knows more; I don't think that we have an umbrella issue for this one yet.

Some refs: kubernetes/release#1693, kubernetes/release#1711

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert saschagrunert changed the title from "Add North Star Vision" to "Add Roadmap and Vision" Apr 23, 2021
roadmap.md (outdated)
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@tpepper
Member

tpepper commented May 6, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label May 6, 2021
@justaugustus
Member

/hold will review the edits next week

Outcome: An automated formal verification of produced release artifacts for
every future release.

1. **Enhance Kubernetes binary artifact management (Consumable)**
Member

It is not clear whether the term "binary artifact" as used here encompasses or excludes images and packages. The reference to #1372 suggests it encompasses them, but the following outcome suggests it only encompasses kubernetes/enhancements#1732, excluding kubernetes/enhancements#1734 and kubernetes/enhancements#1731

Member Author

The linked EPIC covers files, container images as well as deb/rpm packages. We probably wanna remove the word "binary" from the deliverable.

Outcome: An automated formal verification of produced release artifacts for
every future release.

1. **Enhance Kubernetes binary artifact management (Consumable)**
Member

Suggested change
1. **Enhance Kubernetes binary artifact management (Consumable)**
1. **Enhance Kubernetes artifact management (Consumable)**


https://github.com/kubernetes/sig-release/issues/1372

Outcome: Being able to promote files as artifacts and using this mechanism
Member

Suggested change
Outcome: Being able to promote files as artifacts and using this mechanism
Outcome: Being able to promote artifacts and using this mechanism

@jeremyrickard
Contributor

@justaugustus take a final look

@jeremyrickard
Contributor

I did a second review of this and it looks good to me. I think we've gotten to the core of the document, and we can handle anything else in follow-up PRs. What do you think, @saschagrunert? I'll defer to you on dropping the hold if so.

/lgtm
/approve

@justaugustus
Member

/lgtm
/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold label Aug 5, 2021
@k8s-ci-robot k8s-ci-robot merged commit 400b212 into kubernetes:master Aug 5, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Aug 5, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jeremyrickard, justaugustus, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jeremyrickard,justaugustus,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@saschagrunert saschagrunert deleted the north-star branch August 6, 2021 07:40
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/release Categorizes an issue or PR as relevant to SIG Release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.