Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split reconcile into prepare and reconcile #2421

Merged
merged 1 commit into from
Apr 24, 2020

Conversation

afrittoli
Copy link
Member

Changes

The current "reconcile" function does a lot of things, which means
it's hard to test properly, it's hard to follow the logic and
it's hard to handle all possible errors and events properly.

This splits reconcile into two parts, the first one that deals
preparing and validating the taskrun and all the resources it
depends on, the second one that actually reconciles the TaskRun
by creating the pod if required and updating the TaskRun from
the pod.

This adds events that were missing for error situation that
happens during preparation and validation.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

Release Notes

The TaskRun controller now emits events also when error occurs during the validation of the TaskRun or the resources it depends on.

@tekton-robot tekton-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 16, 2020
@tekton-robot tekton-robot requested review from dlorenc and a user April 16, 2020 17:51
@googlebot googlebot added the cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit label Apr 16, 2020
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 16, 2020
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from fca8bbd to 5d67a4d Compare April 16, 2020 18:03
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from 5d67a4d to af1cbe3 Compare April 17, 2020 12:22
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 17, 2020
@afrittoli afrittoli changed the title WIP Split reconcile into prepare and reconcile Split reconcile into prepare and reconcile Apr 17, 2020
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 17, 2020
@ghost ghost added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Apr 17, 2020
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch 2 times, most recently from 7ce886b to 9467425 Compare April 17, 2020 17:08
@ghost
Copy link

ghost commented Apr 17, 2020

/approve

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 17, 2020
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch 4 times, most recently from 9fa0ad0 to bb2ffb4 Compare April 20, 2020 12:04
@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

@afrittoli
Copy link
Member Author

I cannot reproduce any of the integration test failures on my test cluster, even after multiple runs. I think that something in the logic that waits for events is not going well, not waiting long enough or perhaps starting to wait too soon.

@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from b8ff62c to e57c6f5 Compare April 21, 2020 18:32
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch 2 times, most recently from 26768b7 to 7897553 Compare April 21, 2020 20:10
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from aa74c56 to cdec415 Compare April 22, 2020 16:32
@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-build-tests

1 similar comment
@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-build-tests

@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

pkg/reconciler/taskrun/taskrun.go Outdated Show resolved Hide resolved
test/init_test.go Outdated Show resolved Hide resolved
pkg/reconciler/taskrun/taskrun.go Outdated Show resolved Hide resolved
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch 7 times, most recently from 685a28d to 2e396a1 Compare April 23, 2020 15:18
@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-build-tests

@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from 2e396a1 to cf305d6 Compare April 24, 2020 09:36
@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

@afrittoli
Copy link
Member Author

/test pull-tekton-pipeline-build-tests
/test pull-tekton-pipeline-integration-tests

@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from cf305d6 to 5e5bb4e Compare April 24, 2020 15:46
The current "reconcile" function does a lot of things, which means
it's hard to test properly, it's hard to follow the logic and
it's hard to handle all possible errors and events properly.

This splits reconcile into two parts, the first one that deals
preparing and validating the taskrun and all the resources it
depends on, the second one that actually reconciles the TaskRun
by creating the pod if required and updating the TaskRun from
the pod.

This adds events that were missing for error situations that
happen during preparation and validation.

The new events go past the burst limit of the event
broadcaster, which delays some events a lot for a pipeline
with a few tasks. Tweaking the default burst rate and QPS is
required to pass CI. We might want to add those configs to a
dedicated config map in future.
@afrittoli afrittoli force-pushed the split_prepare_reconcile branch from 5e5bb4e to a963e63 Compare April 24, 2020 16:26

correlatorOptions := record.CorrelatorOptions{
// The default burst size is 25
BurstSize: 50,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably fine for now but should be configurable right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll follow up on this

correlatorOptions := record.CorrelatorOptions{
// The default burst size is 25
BurstSize: 50,
QPS: 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the number of queries per second that we can emit? I couldn't figure out after reading the godoc for this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the way I understood this is the QPS we allow after a burst

@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dibyom, sbwsg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dibyom
Copy link
Member

dibyom commented Apr 24, 2020

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 24, 2020
@tekton-robot tekton-robot merged commit c11c6af into tektoncd:master Apr 24, 2020
@afrittoli afrittoli added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants