[release-v0.41.x] picking up latest changes in knative 1.8 #6201

Merged

pritidesai
Member

@pritidesai pritidesai commented Feb 21, 2023

Changes

This PR is a result of:

go get knative.dev/pkg@release-1.8
./hack/update-codegen.sh

knative.dev/pkg was recently updated to include a performance fix that reduced CPU usage by 61% and memory usage by 44% for huge pipelines. This PR updates the Tekton controllers to include that fix so that our LTS release can take advantage of the performance improvement.

The same fix was cherry-picked into knative.dev/pkg 1.9. Tekton pipeline was updated to use knative 1.9 after 0.45, in PR #6062. We have another PR, #6194, open to update the pipeline controllers in the main branch.

/kind misc

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs included if any changes are user facing
  • Has Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including
    functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings)
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

Bring the latest performance fix in knative/pkg which helps reduce CPU/Memory usage for huge pipelines.

knative.dev/pkg was recently updated to include a performance fix which
helped reduce the CPU usage by 61% and memory usage by 44% for huge
pipelines. This commit is updating the tekton controllers to include
that fix such that our LTS release can take advantage of this
performance improvement.

The same fix was cherry picked in knative.dev/pkg 1.9. Tekton pipeline
was updated to use knative 1.9 post 0.45. We have another PR
tektoncd#6194 open to update pipeline
controllers.

Signed-off-by: pritidesai <pdesai@us.ibm.com>
@tekton-robot tekton-robot added kind/misc Categorizes issue or PR as a miscellaneous one. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Feb 21, 2023
@tekton-robot tekton-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 21, 2023
@pritidesai
Member Author

/retest

@vdemeester
Member

/retest
/approve

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 22, 2023
@lbernick
Member

/retest
/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 22, 2023
@JeromeJu
Member

/test pull-tekton-pipeline-beta-integration-tests

@pritidesai
Member Author

the beta test failure is constant in all the attempts so far 😞:

Default change: During creation of nodepools or autoscaling configuration changes for cluster versions greater than 1.24.1-gke.800 a default location policy is applied. For Spot and PVM it defaults to ANY, and for all other VM kinds a BALANCED policy is used. To change the default values use the `--location-policy` flag.
Note: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).
ERROR: (gcloud.beta.container.clusters.create) ResponseError: code=400, message=
	(1) Creation of node pools using node images based on Docker container runtimes is not supported in GKE v1.23. This is to prepare for the removal of Dockershim in Kubernetes v1.24. We recommend that you migrate to image types based on Containerd (examples). For more information, contact Cloud Support
	(2) Creation of node pools using node images based on Docker container runtimes is not supported in GKE v1.24+ clusters as Dockershim has been removed in Kubernetes v1.24.
2023/02/22 16:38:20 process.go:155: Step 'gcloud beta container clusters create --quiet --enable-autoscaling --min-nodes=1 --max-nodes=3 --scopes=cloud-platform --no-issue-client-certificate --project=tekton-prow-7 --region=us-central1 --machine-type=n1-standard-4 --image-type=cos --num-nodes=1 --network=tpipeline-e2e-net1628433604252012544 --cluster-version=1.24.9-gke.2000 tpipeline-e2e-cls1628433604252012544' finished in 2.849625575s
2023/02/22 16:38:20 main.go:319: Something went wrong: starting e2e cluster: error creating cluster: error during gcloud beta container clusters create --quiet --enable-autoscaling --min-nodes=1 --max-nodes=3 --scopes=cloud-platform --no-issue-client-certificate --project=tekton-prow-7 --region=us-central1 --machine-type=n1-standard-4 --image-type=cos --num-nodes=1 --network=tpipeline-e2e-net1628433604252012544 --cluster-version=1.24.9-gke.2000 tpipeline-e2e-cls1628433604252012544: exit status 1

I am having a hard time deciphering the failure, but it looks like a cluster is being created with 1.24. Could that be an issue? 🤔

Also, I checked the two latest commits in release-v0.41.x and noticed the beta tests were not part of the CI. Wondering why?

@linux-foundation-easycla[bot]
EasyCLA check passed. You are authorized to contribute.
@tekton-robot
check-github-tasks-completed Job Successful.
@tekton-robot
check-pr-has-kind-label Job Successful.
@tekton-robot
pull-tekton-pipeline-alpha-integration-tests Job succeeded. BaseSHA:9ee839025241dcaf1dc3b9086e3c5fdb0c4126c1
@tekton-robot
pull-tekton-pipeline-build-tests Job succeeded. BaseSHA:9ee839025241dcaf1dc3b9086e3c5fdb0c4126c1
@tekton-robot
pull-tekton-pipeline-go-coverage Job succeeded. BaseSHA:9ee839025241dcaf1dc3b9086e3c5fdb0c4126c1
@tekton-robot
pull-tekton-pipeline-integration-tests Job succeeded. BaseSHA:9ee839025241dcaf1dc3b9086e3c5fdb0c4126c1
@tekton-robot
pull-tekton-pipeline-unit-tests Job succeeded. BaseSHA:9ee839025241dcaf1dc3b9086e3c5fdb0c4126c1
@tekton-robot
tide In merge pool.

@JeromeJu
Member

The beta integration test was introduced but held back from merging until after the V1 CRD release, which could be why it was not in v0.41.

@JeromeJu
Member

As for the cluster versioning, tektoncd/plumbing#1348 and tektoncd/plumbing#1332 should both be in the last release, though I am not sure whether that should be the case. cc @XinruZhang @vdemeester 🤔

@pritidesai
Member Author

One more thing I noticed, we are running go 1.19:

go version
go version go1.19 linux/amd64

This might not be causing the issue here, but our pipeline repo is compliant with Go 1.18 as per go.mod.

@pritidesai
Member Author

An interesting comparison of logs from beta tests in this PR and one of the recent PRs:

[Image: Screen Shot 2023-02-22 at 12 06 46 PM]

The beta tests in this PR skip setting up the KIND cluster and run ./test/presubmit-tests.sh --integration-tests instead of test/e2e-tests.sh.

@XinruZhang
Member

@pritidesai

ERROR: (gcloud.beta.container.clusters.create) ResponseError: code=400, message=
	(1) Creation of node pools using node images based on Docker container runtimes is not supported in GKE v1.23. This is to prepare for the removal of Dockershim in Kubernetes v1.24. We recommend that you migrate to image types based on Containerd (examples). For more information, contact Cloud Support
	(2) Creation of node pools using node images based on Docker container runtimes is not supported in GKE v1.24+ clusters as Dockershim has been removed in Kubernetes v1.24.

The error comes from the --image-type=cos parameter we specify when creating the cluster with the following command, but I am not sure why the error only occurs in this PR.

gcloud beta container clusters create --quiet --enable-autoscaling --min-nodes=1 --max-nodes=3 --scopes=cloud-platform --no-issue-client-certificate --project=tekton-prow-7 --region=us-central1 --machine-type=n1-standard-4 --num-nodes=1 --cluster-version=1.24.9-gke.2000 tpipeline-e2e
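
As a rough illustration of the --image-type point, here is a minimal Python sketch that scans a gcloud command line for a Docker-based image type. The function name find_docker_image_type and the DOCKER_BASED_IMAGE_TYPES set are hypothetical and are not part of gcloud or the Tekton plumbing scripts. It assumes that cos and ubuntu are the Docker-based types and that their _containerd variants are the containerd-based ones. The sample command is an abbreviated form of the one in the log above.

```python
import shlex

# Assumption: these GKE image types use the Docker runtime, while the
# "_containerd" variants do not. Illustrative only, not an exhaustive list.
DOCKER_BASED_IMAGE_TYPES = {"cos", "ubuntu"}


def find_docker_image_type(command):
    """Return the Docker-based --image-type value in a gcloud command, or None."""
    for arg in shlex.split(command):
        if arg.startswith("--image-type="):
            value = arg.split("=", 1)[1].lower()
            if value in DOCKER_BASED_IMAGE_TYPES:
                return value
    return None


# Abbreviated version of the failing command from the CI log.
failing = (
    "gcloud beta container clusters create --machine-type=n1-standard-4 "
    "--image-type=cos --num-nodes=1 --cluster-version=1.24.9-gke.2000 tpipeline-e2e"
)

print(find_docker_image_type(failing))  # cos
print(find_docker_image_type(failing.replace("=cos ", "=cos_containerd ")))  # None
```

A check like this only flags the argument. It does not explain why the CI job for this branch provisions GKE in the first place.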

@pritidesai
Member Author

thanks @XinruZhang 👍

I am also wondering why a GKE cluster is being provisioned instead of KIND.

@pritidesai
Member Author

pritidesai commented Feb 22, 2023

Could it be because this beta env. setting is missing from this branch (release-v0.41.x)?

@pritidesai
Member Author

Do we need to cherry pick this PR which added prow env. for beta tests?

@pritidesai
Member Author

Could it be because this beta env. setting is missing from this branch (release-v0.41.x)?

It was introduced in 0.45 and is not part of the branches from release-v0.41.x through release-v0.44.x. I just checked the latest commit in release-v0.44.x, and the beta test did not run on that branch.
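
The branch check described above can be sketched in a few lines of Python. This is only an illustration: release_branch_version and predates are hypothetical helper names, the 0.45 threshold is taken from the comment above, and release-v0.45.x is assumed to follow the same naming pattern as the older branches.

```python
import re


def release_branch_version(branch):
    """Parse 'release-v0.41.x' into (0, 41); return None for other names."""
    match = re.fullmatch(r"release-v(\d+)\.(\d+)\.x", branch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def predates(branch, introduced_in):
    """True if the release branch is older than the (major, minor) version."""
    version = release_branch_version(branch)
    return version is not None and version < introduced_in


# Threshold taken from the comment above: the beta env setting landed in 0.45.
BETA_ENV_INTRODUCED_IN = (0, 45)

for branch in ["release-v0.41.x", "release-v0.44.x", "release-v0.45.x"]:
    print(branch, predates(branch, BETA_ENV_INTRODUCED_IN))
```

Branches for which predates returns True are the ones that would need the beta env change cherry-picked before the beta job can run on them.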

@pritidesai
Member Author

One more follow up PR which we might have to cherry pick? #6031 🤔

@XinruZhang
Member

Wondering if there are other modifications we need to cherrypick to ensure our CI works for previous releases. 🤔

@XinruZhang
Member

It looks like we are good to go based on the comparison between this branch and the latest branch: https://gist.github.com/XinruZhang/8a631aaa69457d53b96ce71ca1d5c428

@pritidesai
Member Author

It looks like we are good to go based on the comparison between this branch and the latest branch: https://gist.github.com/XinruZhang/8a631aaa69457d53b96ce71ca1d5c428

Cherrypick PR #6031 and #5737? Any other PRs I am missing to cherrypick?

@pritidesai
Member Author

It looks like we are good to go based on the comparison between this branch and the latest branch: https://gist.github.com/XinruZhang/8a631aaa69457d53b96ce71ca1d5c428

Oh no, the gist has many more changes 😞, but the larger results feature was introduced in 0.43, so it does not need to be cherrypicked. Also, we do need to cherrypick one more PR: #5726

@pritidesai
Member Author

So to make 0.41 releasable, we need to cherrypick the following PRs:

Please correct me if I am wrong or add anything I am missing.

@XinruZhang
Member

XinruZhang commented Feb 22, 2023

I wonder whether we need to run pull-tekton-pipeline-beta-integration-tests for the releases before v0.43.x, because it was introduced in tektoncd/plumbing#1256 for testing beta features after the V1 CRDs were released.

cc @tektoncd/core-collaborators @lbernick @JeromeJu

@JeromeJu
Member

I wonder whether we need to run pull-tekton-pipeline-beta-integration-tests for the releases before v0.43.x, because it was introduced in tektoncd/plumbing#1256 for testing beta features after the V1 CRDs were released.

cc @tektoncd/core-collaborators @lbernick @JeromeJu

I think that we should not need to. It should only matter for those integration tests that require beta gates, so it should not be a problem even if it is run against them 🤔

@pritidesai
Member Author

Yup, I have been trying to see if it's possible to disable beta integration tests before 0.43 (or the release when v1 CRDs were introduced). There should not be any harm in running beta, but it's equivalent to running pull-tekton-pipeline-integration-tests and does not add any value. 0.41 is LTS, which makes it crucial to have a working CI for that branch. I will cherrypick those three PRs and we can take it from there. Thoughts?

@XinruZhang
Member

Yup, I have been trying to see if it's possible to disable beta integration tests before 0.43 (or the release when v1 CRDs were introduced). There should not be any harm in running beta, but it's equivalent to running pull-tekton-pipeline-integration-tests and does not add any value. 0.41 is LTS, which makes it crucial to have a working CI for that branch. I will cherrypick those three PRs and we can take it from there. Thoughts?

SGTM! Thank you @pritidesai !

@pritidesai
Member Author

Hey @JeromeJu @XinruZhang @vdemeester @lbernick, I have a separate PR opened to fix the CI failure for the release-v0.41.x branch: #6213

The beta tests succeeded in that PR, hoping to see the rest of the checks succeed as well 🤞

@pritidesai
Member Author

/retest

@tekton-robot tekton-robot merged commit 1817f00 into tektoncd:release-v0.41.x Feb 23, 2023
@pritidesai pritidesai deleted the bring-latest-knative-1.8 branch February 23, 2023 19:45
pritidesai added a commit to pritidesai/pipeline that referenced this pull request Feb 23, 2023
In PR tektoncd#6201, we identified a failure with our CI system. Beta integrations tests
are failing in PR tektoncd#6201. Our CI system has been updated to run beta tests but
the code base in release-v0.42.x was not updated with necessary changes to
run beta tests successfully. We discovered a list of PRs which needs to be
cherry picked into release-0.42.x branch for the tests to work.

Cherry picking the following PRs:

[upgrade test] Change to Kind cluster and Unfixed upgrade test release version tektoncd#5726
Add prow env for beta integration test tektoncd#5737
Beta Example Tests tektoncd#6031

Signed-off-by: pritidesai <pdesai@us.ibm.com>
pritidesai added a commit to pritidesai/pipeline that referenced this pull request Feb 23, 2023
In PR tektoncd#6201, we identified a failure with our CI system. Beta integrations tests
are failing in PR tektoncd#6201. Our CI system has been updated to run beta tests but
the code base in release-v0.44.x was not updated with necessary changes to
run beta tests successfully. We discovered a list of PRs which needs to be
cherry picked into release-0.44.x branch for the tests to work.

Cherry picking the following PRs:

[upgrade test] Change to Kind cluster and Unfixed upgrade test release version tektoncd#5726
Add prow env for beta integration test tektoncd#5737
Beta Example Tests tektoncd#6031

Running update-codegen using Go 1.19

Signed-off-by: pritidesai <pdesai@us.ibm.com>
pritidesai added a commit to pritidesai/pipeline that referenced this pull request Feb 24, 2023
In PR tektoncd#6201, we identified a failure with our CI system. Beta integrations tests
are failing in PR tektoncd#6201. Our CI system has been updated to run beta tests but
the code base in release-v0.44.x was not updated with necessary changes to
run beta tests successfully. We discovered a list of PRs which needs to be
cherry picked into release-0.44.x branch for the tests to work.

Cherry picking the following PRs:

[upgrade test] Change to Kind cluster and Unfixed upgrade test release version tektoncd#5726
Add prow env for beta integration test tektoncd#5737
Beta Example Tests tektoncd#6031

Running update-codegen using Go 1.19

Signed-off-by: pritidesai <pdesai@us.ibm.com>
tekton-robot pushed a commit that referenced this pull request Feb 27, 2023
In PR #6201, we identified a failure with our CI system. Beta integrations tests
are failing in PR #6201. Our CI system has been updated to run beta tests but
the code base in release-v0.42.x was not updated with necessary changes to
run beta tests successfully. We discovered a list of PRs which needs to be
cherry picked into release-0.42.x branch for the tests to work.

Cherry picking the following PRs:

[upgrade test] Change to Kind cluster and Unfixed upgrade test release version #5726
Add prow env for beta integration test #5737
Beta Example Tests #6031

Signed-off-by: pritidesai <pdesai@us.ibm.com>
tekton-robot pushed a commit that referenced this pull request Feb 27, 2023
In PR #6201, we identified a failure with our CI system. Beta integrations tests
are failing in PR #6201. Our CI system has been updated to run beta tests but
the code base in release-v0.44.x was not updated with necessary changes to
run beta tests successfully. We discovered a list of PRs which needs to be
cherry picked into release-0.44.x branch for the tests to work.

Cherry picking the following PRs:

[upgrade test] Change to Kind cluster and Unfixed upgrade test release version #5726
Add prow env for beta integration test #5737
Beta Example Tests #6031

Running update-codegen using Go 1.19

Signed-off-by: pritidesai <pdesai@us.ibm.com>