Document how PVC access modes affect pipeline execution #2521

Merged

jlpettersson
Member

Changes

The access mode configured for a PVC used as a workspace volume source
may affect how a pipeline is executed, but this is not well documented. This commit
intends to improve the documentation on this and also provides an example
of how to run parallel tasks when using a PVC with the ReadWriteOnce access mode.

  • Document how the access mode affects task ordering in a pipeline under "Specifying Workspace order in a Pipeline"
  • Provide a full pipeline example of how to use parallel tasks when using a PVC with access mode ReadWriteOnce
  • We do not provide a "PipelineRun example" under "Example PipelineRun definitions using Workspaces". This commit provides a full PipelineRun example using a workspace. Instead of providing examples of different volume sources here, the examples of how to use different VolumeSources are moved to "Specifying VolumeSources in Workspaces"
  • The VolumeSources persistentVolumeClaim and volumeClaimTemplate are both PVCs, and PVCs have their own peculiarities, e.g. access mode, which we now document. Both PVC volume sources are moved to one section so we can document the common peculiarities in a single place
  • Add the workspace variable introduced in "Add variable substitution for PVC name" (#2506)

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

/kind documentation

@tekton-robot tekton-robot added the kind/documentation Categorizes issue or PR as related to documentation. label May 1, 2020
@tekton-robot tekton-robot requested review from dlorenc and a user May 1, 2020 10:22
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 1, 2020
@tekton-robot
Collaborator

Hi @jlpettersson. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ghost ghost left a comment

Thank you for this additional documentation, really useful to know!

docs/workspaces.md (outdated)
information, see the [`runAfter` documentation](pipelines.md#runAfter).

**Warning:** You *must* ensure that this order is compatible with your configured access modes for your `PersistentVolumeClaim`.
Parallel `Tasks` using the same `PersistentVolumeClaim` with access mode `ReadWriteOnce`, may execute on
different nodes and be forced to execute sequentially which may cause `Tasks` to time out.
Does the k8s scheduler force this sequential ordering automatically or is it something the user is forced to do by adding runAfter?

Member Author

The k8s scheduler places the pods on nodes immediately, and they all try to mount the PVC. But the mount only succeeds on one node at a time. Pods on other nodes are stuck in state "Pending", but they eventually start to run when the volume is no longer in use on other nodes.

And by the time the pods on the "other nodes" get into "Running", the Task has already started, so it may have "timed out", as was already documented.

But on my single-node cluster all tasks can run concurrently, so I did not see that parallel tasks on other nodes may not start until the volume is available.

I did not know how this worked, so I had to create a bigger test to validate it. @skaegi and I discussed it in the last API WG.
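
As an illustration of the setup discussed in this thread, here is a minimal sketch of a claim that requests the `ReadWriteOnce` access mode. The claim name and the storage size are hypothetical and are not taken from this PR.

```yaml
# Hypothetical PVC for illustration; the name and the size are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-workspace-pvc
spec:
  accessModes:
    - ReadWriteOnce   # the volume can be mounted read-write by a single node
  resources:
    requests:
      storage: 1Gi
```

With a claim like this, pods scheduled to the same node can share the volume, while pods on other nodes wait for it to be released, as described in the comment above.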

@@ -46,6 +46,8 @@ spec:
      workspace: ws
      subPath: dir-1
  - name: writer-2
    runAfter:
    - writer-1
Interesting! This example was working already I think? So here we are just kinda documenting the ordering which the PVC accessMode already forced?

Member Author

Yes, it was "working", but the tasks were declared "parallel". So:

  • the two tasks will successfully run concurrently if scheduled to the same node
  • the two tasks may successfully run sequentially if scheduled to different nodes and the second task does not time out
  • the second task may time out if it is scheduled to another node and it takes too long before it can execute/finish (I don't know how we implement the timeout)

Consequently: if this is used in regression testing, the test may sometimes be flaky.

But this fix makes the two tasks always run in sequence. For parallel tasks we should do as in the example provided in this commit, unless we know that it will not time out.
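
The sequential variant can be sketched roughly as follows. The names `writer-1`, `writer-2`, `ws`, and `dir-1` come from the diff in this thread; the pipeline name, the `writer` task reference, the `task-ws` workspace name, and `dir-2` are assumptions made for illustration and may differ from the actual example file.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: sequential-writers        # hypothetical name
spec:
  workspaces:
  - name: ws                      # bound to a PVC by the PipelineRun
  tasks:
  - name: writer-1
    taskRef:
      name: writer                # hypothetical Task that writes to its workspace
    workspaces:
    - name: task-ws               # hypothetical workspace name declared by the Task
      workspace: ws
      subPath: dir-1
  - name: writer-2
    runAfter:
    - writer-1                    # writer-2 starts only after writer-1 has finished
    taskRef:
      name: writer
    workspaces:
    - name: task-ws
      workspace: ws
      subPath: dir-2              # hypothetical second sub-directory
```

Because `writer-2` declares `runAfter`, the two pods never compete for the volume, whichever nodes they are scheduled to.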

Ah, I see, great catch!

@ghost

ghost commented May 1, 2020

/ok-to-test

@tekton-robot tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 1, 2020
@jlpettersson jlpettersson force-pushed the document_peculiarities_with_pvc branch from 2c17d70 to 38603ab Compare May 1, 2020 11:33
@jlpettersson
Member Author

/test pull-tekton-pipeline-integration-tests

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbwsg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 1, 2020
@jlpettersson
Member Author

I don't see why the integration tests fail, but I'll give it another try...

/test pull-tekton-pipeline-integration-tests

@jlpettersson
Member Author

/test pull-tekton-pipeline-integration-tests

The access mode configured for a PVC that is used as a workspace volume source
may affect how a pipeline is executed, but this is not well documented. This commit
intends to improve the documentation on this and also provides an example
of how to run parallel tasks when using a PVC with the ReadWriteOnce access mode.

- Document how the access mode affects task ordering in a pipeline under "Specifying Workspace order in a Pipeline"
- Provide a full pipeline example of how to use parallel tasks when using a PVC with access mode ReadWriteOnce
- We did not provide a "PipelineRun example" under "Example PipelineRun definitions using Workspaces".
  This commit provides a full PipelineRun example using a workspace. Instead of providing examples of
  different volume sources here, the examples of how to use different VolumeSources are moved to
  "Specifying VolumeSources in Workspaces"
- The VolumeSources persistentVolumeClaim and volumeClaimTemplate are both PVCs, and PVCs have their own
  peculiarities, e.g. access mode, which we now document. Both PVC volume sources are moved to one section
  so we can document the common peculiarities in a single place
- Add the workspace variable introduced in tektoncd#2506
@jlpettersson jlpettersson force-pushed the document_peculiarities_with_pvc branch from 38603ab to 4e8a7a5 Compare May 1, 2020 18:57
@jlpettersson
Member Author

Error from server (AlreadyExists): error when creating "STDIN": tasks.tekton.dev "writer" already exists

There was actually a problem with the naming in my example: a name collision with another example file. I changed the name of my example task.

@jlpettersson
Member Author

/test pull-tekton-pipeline-integration-tests

Member

@afrittoli afrittoli left a comment

Thank you!
/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label May 2, 2020
@jlpettersson
Member Author

/test pull-tekton-pipeline-build-tests

@jlpettersson
Member Author

/test pull-tekton-pipeline-integration-tests

@tekton-robot tekton-robot merged commit 7bbc7eb into tektoncd:master May 2, 2020
jlpettersson added a commit to jlpettersson/pipeline that referenced this pull request May 16, 2020
This commit contains changes for the TaskRun section similar to what was done for the PipelineRun section in tektoncd#2521

- We did not provide a full example of a TaskRun definition under "Examples of `TaskRun` definition using `Workspaces`".
  This commit adds a full example of a TaskRun definition.

- Examples of how to use different volume sources already exist under "Specifying `VolumeSources` in `Workspaces`",
  so add a link to that section instead of having the same examples under the TaskRun section.

- A few broken links in the table of contents are fixed.

/kind documentation
jlpettersson added a commit to jlpettersson/pipeline that referenced this pull request May 17, 2020
tekton-robot pushed a commit that referenced this pull request May 18, 2020
pritidesai pushed a commit to pritidesai/pipeline that referenced this pull request May 19, 2020
jlpettersson added a commit to jlpettersson/community that referenced this pull request May 19, 2020
Jonas has recently become a regular contributor. He started by adding a minor [_missing_ `omitempty`](tektoncd/pipeline#2301) and then [proposed some ideas](tektoncd/pipeline#1986 (comment)) around workspaces and PersistentVolumeClaim creation and continued to [elaborate on those ideas](tektoncd/pipeline#1986 (comment)). On a sunny day a few days later, he also submitted an [extensive implementation for volumeClaimTemplate](tektoncd/pipeline#2326), corresponding to the idea discussions.

A few days later he submitted a [small refactoring PR](tektoncd/pipeline#2392), and he also listened to community members who [proposed changes](tektoncd/pipeline#2450) to his volumeClaimTemplate implementation and did an [implementation for that proposal](tektoncd/pipeline#2453).

On a rainy day, he also wrote [technical documentation about PVCs](tektoncd/pipeline#2521), including an example that caused _flaky_ integration tests for the whole community for several days. When he understood his mistake, he submitted a [removal of the example](tektoncd/pipeline#2546) that caused the flaky tests.

He has also dipped his toe into the Tekton Catalog and [contributed to the buildah task](tektoncd/pipeline#2546).

This has been followed mostly by more PRs to the Pipeline project:

- tektoncd/pipeline#2460
- tektoncd/pipeline#2491
- tektoncd/pipeline#2502
- tektoncd/pipeline#2506
- tektoncd/pipeline#2632
- tektoncd/pipeline#2633
- tektoncd/pipeline#2634
- tektoncd/pipeline#2636
- tektoncd/pipeline#2601
- tektoncd/pipeline#2630

Jonas is excited about the project and the great community around Tekton! He would now like to join the org.
tekton-robot pushed a commit to tektoncd/community that referenced this pull request May 20, 2020