Verify if pipeline works after upgrading from previous release to current release 🧗‍♂️ #1162

Merged
2 commits merged on Aug 30, 2019

Conversation

@houshengbo houshengbo commented Aug 5, 2019

Changes

This PR adds the upgrade test case to install the previous release, upgrade to the current release, and run the integration tests.
Partially-closes: #1114

Submitter Checklist

These are the criteria that every PR should meet; please check them off as you review them:

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Release Notes

Describe any user facing changes here, or delete this block.

Upgrade test cases are added.

@googlebot googlebot added the cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit label Aug 5, 2019
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 5, 2019
@houshengbo houshengbo changed the title Add the upgrade tests Verify if pipeline still works after upgrading from previous release to current release 🧗‍♂️ Aug 5, 2019
@houshengbo houshengbo changed the title Verify if pipeline still works after upgrading from previous release to current release 🧗‍♂️ Verify if pipeline works after upgrading from previous release to current release 🧗‍♂️ Aug 5, 2019
@houshengbo houshengbo changed the title Verify if pipeline works after upgrading from previous release to current release 🧗‍♂️ WIP: Verify if pipeline works after upgrading from previous release to current release 🧗‍♂️ Aug 5, 2019
@tekton-robot tekton-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 5, 2019
@bobcatfish
Collaborator

/test all

@bobcatfish
Collaborator

I0805 20:06:55.021] 2019/08/05 20:06:55 error processing import paths in "config/controller.yaml": UNKNOWN: Unable to determine the upload's size.
I0805 20:06:55.099] ERROR: Build pipeline installation failed

hmmmm

/test pull-tekton-pipeline-integration-tests

Member

@vdemeester vdemeester left a comment

@houshengbo thx for starting work on this one 👼
I'll add a prow job for this 👼

test/e2e-tests-upgrade.sh
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 6, 2019
@houshengbo houshengbo changed the title WIP: Verify if pipeline works after upgrading from previous release to current release 🧗‍♂️ Verify if pipeline works after upgrading from previous release to current release 🧗‍♂️ Aug 6, 2019
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 6, 2019
@houshengbo
Author

/retest

Member

@afrittoli afrittoli left a comment

Thank you for this, it's an important test to have!

My ideal solution would be to have this test running as a Tekton pipeline, but since we do not have the facility to do that yet, starting with a bash script is ok.

If I got it right, the current sequence is:

  • install previous release
  • upgrade to master
  • run E2E and YAML tests
  • uninstall
  • install previous release
  • run YAML tests (but not result check)
  • install master
  • run E2E and YAML tests
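
A rough sketch of that sequence, for orientation only (the helper names come from the diff hunks quoted in this review; install_pipeline_crd and its version argument are assumptions on my part, and the bodies are illustrative rather than the actual script):

```bash
# Scenario 1: install the previous release, upgrade, then run the tests.
install_pipeline_crd "v${PREVIOUS_PIPELINE_VERSION}"   # previous release (assumed helper/argument)
install_pipeline_crd                                    # upgrade to the code under test
for test in taskrun pipelinerun; do
  run_yaml_tests ${test}                                # YAML/E2E checks after the upgrade
done

uninstall_pipeline_crd                                  # clean up before scenario 2

# Scenario 2: create resources on the previous release, upgrade, then verify them.
install_pipeline_crd "v${PREVIOUS_PIPELINE_VERSION}"
for test in taskrun pipelinerun; do
  apply_resources ${test}                               # apply only, result not checked yet
done
install_pipeline_crd                                    # upgrade to the code under test
for test in taskrun pipelinerun; do
  run_tests ${test}                                     # verify results after the upgrade
done
```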

My opinion on this is that we should have upgrade targeted tests, as this whole sequence may take too long and not necessarily test all that we need to test - i.e. I'm not convinced this would have caught the issues we have encountered so far with upgrades; however, I'd like to see what other maintainers think too.
I'm not sure I fully understand the need to go through the upgrade twice.

It is fine to start with E2E or YAML tests until we develop more targeted tests, but I would like to have dedicated functions that wrap the things to be done before and after the upgrade, so that it's easy to add / remove things from there and to migrate them to Tasks once we're ready for that.
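
A minimal sketch of what such wrapper functions could look like (the function names are hypothetical; apply_resources and run_tests are helpers quoted elsewhere in this review):

```bash
# Everything that has to happen before the upgrade lives in one place...
function pre_upgrade_setup() {
  for test in taskrun pipelinerun; do
    apply_resources ${test}
  done
}

# ...and everything that verifies the cluster after the upgrade lives in another,
# so each side can later be migrated to a Tekton Task on its own.
function post_upgrade_verify() {
  for test in taskrun pipelinerun; do
    run_tests ${test}
  done
}
```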

Could you please change the commit message to match the recommended format?


for test in taskrun pipelinerun; do
  header "Running YAML e2e tests for ${test}s"
  if ! run_yaml_tests ${test}; then
Member

I wonder if we really need to run all YAML tests here?

Author

Yes, there are two scenarios covered here; this is the first one.

done

# Remove all the pipeline CRDs
uninstall_pipeline_crd
Member

Why do we need to install and upgrade twice?

Author

@afrittoli These are actually two cases, as identified in #1114.
They are slightly different, not exactly the same.

Author

When we finish the first one, we remove and clean up everything, then run the second scenario.

Collaborator

Makes sense @houshengbo - I had the same question though! Maybe there could be some comments that explain these scenarios in a bit more detail so it would be clear to someone who just sees this script?

Author

I will add more comments on these two scenarios.


for test in taskrun pipelinerun; do
  header "Applying the resources ${test}s"
  apply_resources ${test}
Member

This runs all YAML tests again, but it does not check the result, right?

I think it would be nice to create a targeted set of resources of different types (e.g. PipelineResources, a Task, a Pipeline, and perhaps long-running TaskRuns and PipelineRuns) and to verify them after the upgrade.
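
A hedged sketch of that idea, assuming standard kubectl and Tekton's Succeeded condition (the manifest path and TaskRun name are made up for illustration, not from this PR):

```bash
# Start a long-running TaskRun before the upgrade.
kubectl apply -f ./test/upgrade/long-running-taskrun.yaml   # hypothetical manifest

# ... upgrade to the current release here ...

# After the upgrade, wait for the pre-upgrade TaskRun to reach Succeeded=True.
for i in $(seq 1 60); do
  status=$(kubectl get taskrun long-running-taskrun \
    -o jsonpath='{.status.conditions[?(@.type=="Succeeded")].status}')
  [[ "${status}" == "True" ]] && break
  sleep 10
done
[[ "${status}" == "True" ]] || { echo "TaskRun did not succeed after the upgrade"; exit 1; }
```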

Author

No, this is a different case, as I stated before.
We only run apply_resources here.

Then, after installing the CRDs and doing the upgrade, we run run_tests ${test}.

Author

@houshengbo houshengbo Aug 28, 2019

You may have noticed that apply_resources and run_tests used to be in one function, but I split them specifically for the two upgrade scenarios.

Collaborator

@bobcatfish bobcatfish left a comment

My opinion on this is that we should have upgrade targeted tests

Could do! This might be an okay starting point in the meantime? It'll also be very hard for folks writing features to know whether they should add upgrade tests or not, unless we can provide some clear guidelines (it's not obvious to me at the moment what the guidelines would be, though).

as this whole sequence may take too long and not necessarily test all that we need to test

In tektoncd/plumbing#56 the job is being added as a presubmit - why don't we try a nightly periodic run instead? (e.g. like https://github.com/knative/test-infra/blob/master/ci/prow/config.yaml#L2821)

The downside is we'd need to keep an eye on the prow dashboard (https://prow.tekton.dev) - but we could at the very least check it before making a release?

# and deploy Tekton Pipelines to it for running integration tests.

source $(dirname $0)/e2e-common.sh
PREVIOUS_PIPELINE_VERSION=0.5.2
Collaborator

this is pretty minor, but since the v is actually part of the tag, maybe it would make sense to include it here vs. appending it wherever the version variable gets used?
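
A minimal sketch of the suggestion (only PREVIOUS_PIPELINE_VERSION comes from the diff above; the helper call is hypothetical):

```bash
# Keep the "v" in the variable itself so call sites do not have to prepend it.
PREVIOUS_PIPELINE_VERSION=v0.5.2
install_pipeline_crd "${PREVIOUS_PIPELINE_VERSION}"   # hypothetical helper usage
```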

@houshengbo
Author

houshengbo commented Aug 29, 2019

@bobcatfish Regarding what type of prow job we should create for the upgrade tests, there are pros and cons to both a PR-based job and a periodic job.

With a PR-based job, it is true that running on every PR consumes more infrastructure resources, but it guards quality better, since we know immediately whether the upgrade fails or not.

With a periodic job, someone needs to watch the job on a daily, weekly, or whatever frequency to see if there is a problem, and then we have to locate which commit introduced the issue. Even once we find it, we need to open a new PR and manually launch the upgrade tests to verify the fix. If there is no problem, that's fine; however, if there is an issue, it will be very time-consuming to fix.

I still lean toward a new prow job dedicated to the upgrade tests. It is true that it consumes infrastructure resources, but the benefit of running automatically will pay off.

@houshengbo
Author

houshengbo commented Aug 29, 2019

@afrittoli There are at least two scenarios for us to cover now, as I tried to tackle in issue #1114. That is why you see install, test, uninstall... install again... I will write more comments in the script for better understanding.

@houshengbo
Author

/retest all

@houshengbo
Author

/test pull-tekton-pipeline-build-tests

@houshengbo
Author

/test pull-tekton-pipeline-integration-tests

@tekton-robot tekton-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 29, 2019
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 29, 2019
@bobcatfish
Collaborator

I still lean toward a new prow job dedicated to the upgrade tests. It is true that it consumes infrastructure resources, but the benefit of running automatically will pay off.

I'm less concerned about the infrastructure resources and more about turnaround time for contributors - the end to end tests are already the slowest check on each PR, and this would make them even slower :S

What do you think if we start out with making this a periodic job and then see how often it fails? If it's often enough that we want to catch it on PRs we can add it as a presubmit job. I'm assuming this would be infrequent enough that it wouldn't be too bad.

Another idea:
We could make it so that this check is not mandatory for merging - but that might be confusing.

Anyone else have other ideas @afrittoli @vdemeester @abayer @imjasonh @sbwsg @dibyom ?

@vdemeester
Member

What do you think if we start out with making this a periodic job and then see how often it fails? If it's often enough that we want to catch it on PRs we can add it as a presubmit job. I'm assuming this would be infrequent enough that it wouldn't be too bad.

I do agree, we could have that as a periodic build for now, so not on all PRs.

@bobcatfish
Collaborator

I do agree, we could have that as a periodic build for now, so not on all PRs.

kk - I'll move that discussion over to tektoncd/plumbing#56

In the meantime I think we can merge this. @houshengbo I think that "scenario 1" and "scenario 2" in the comments in the script could use some more detail, but I also think we can iterate and merge this as-is now:

/lgtm
/approve

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 30, 2019
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bobcatfish, houshengbo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 30, 2019
@houshengbo
Author

/test pull-tekton-pipeline-build-tests
