status: proposed
title: Data Locality and Pod Overhead in Pipelines
creation-date: 2021-01-22
last-updated: 2022-05-26
authors: @bobcatfish, @lbernick
TEP-0044: Data Locality and Pod Overhead in Pipelines

Summary

As stated in Tekton's reusability design principles, Pipelines and Tasks should be reusable in a variety of execution contexts. However, because each TaskRun is executed in a separate pod, Task and Pipeline authors indirectly control the number of pods used in execution. This introduces both the overhead of extra pods and friction associated with moving data between Tasks.

This TEP lists the pain points associated with running each TaskRun in its own pod and describes the current features that mitigate these pain points. It explores several additional execution options for Pipelines but does not yet propose a preferred solution.

Motivation

The choice of one pod per Task works for most use cases for a single TaskRun, but can cause friction when TaskRuns are combined in PipelineRuns. These problems are exacerbated by complex Pipelines with large numbers of Tasks. There are two primary pain points associated with coupling each TaskRun to an individual pod: the overhead of each additional pod and the difficulty of passing data between Tasks in a Pipeline.

Pod overhead

Pipeline authors benefit when Tasks are made as self-contained as possible, but the more that Pipeline functionality is split between modular Tasks, the greater the number of pods used in a PipelineRun. Each pod consumes some system resources in addition to the resources needed to run each container and takes time to schedule. Therefore, each additional pod increases the latency of and resources consumed by a PipelineRun.

Difficulty of moving data between Tasks

Many Tasks require some form of input data or emit some form of output data, and Pipelines frequently use Task outputs as inputs for subsequent Tasks. Common Task inputs and outputs include repositories, OCI images, events, or unstructured data copied to or from cloud storage. Scheduling TaskRuns on separate pods requires these artifacts to be stored somewhere outside of the pods. This could be storage within a cluster, like a PVC, configmap, or secret, or remote storage, like a cloud storage bucket or image repository.

Workspaces make it easier to "shuttle" data through a Pipeline by abstracting details of data storage out of Pipelines and Tasks. They currently support only forms of storage within a cluster (PVCs, configmaps, secrets, and emptydir). Abstracting data storage out of Pipeline and Task definitions helps make them more reusable, but doesn't address the underlying problem that some form of external data storage is needed to pass artifacts between TaskRuns.

The need for data storage locations external to pods introduces friction in a few different ways. First, moving data between storage locations can incur monetary cost and latency. There are also some pain points associated specifically with PVCs, the most common way of sharing data between TaskRuns. Creating and deleting PVCs (typically done with each PipelineRun) incurs additional load on the Kubernetes API server and storage backend, increasing PipelineRun latency. In addition, some systems support only the ReadWriteOnce access mode for PVCs, which allows the PVC to be mounted on a single node at a time. This means that Pipeline TaskRuns that share data and run in parallel must run on the same node.

Lastly, Tekton's current implementation requires users to understand that TaskRuns are run as pods, and therefore, they need to provide persistent storage because pods do not share a filesystem. Even if creating pods and PVCs had no overhead, users would still need to understand how to provide data storage even for simple examples like a "clone, build, push" Pipeline.
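For illustration, sharing a workspace between Tasks today typically means backing it with a PVC in the PipelineRun, for example via a volumeClaimTemplate. A minimal sketch (assuming a Pipeline named clone-build-push whose Tasks share a "source" workspace) looks like this:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: clone-build-push-run
spec:
  pipelineRef:
    name: clone-build-push # assumed Pipeline; not defined in this TEP
  workspaces:
  - name: source
    volumeClaimTemplate: # a new PVC is created for every PipelineRun
      spec:
        accessModes:
        - ReadWriteOnce # parallel Tasks sharing this PVC must run on the same node
        resources:
          requests:
            storage: 1Gi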

The following issues describe some of these difficulties in more detail:

Existing Workarounds and Mitigations

There's currently no workaround that addresses the overhead of extra pods or storage without harming reusability.

Combine multiple pieces of functionality in one Task

Instead of combining functionality provided by Tasks into Pipelines, a Task or Pipeline author could use Steps or a multifunctional script to combine all necessary functionality into a single Task. This allows multiple "actions" to be run in one pod, but hurts reusability and makes parallel execution more difficult.

Use PipelineResources (deprecated) to express a workflow in one Task

PipelineResources allowed multiple pieces of functionality to run in a single pod by building some of these functions into the TaskRun controller. This allowed some workflows to be written as single Tasks. For example, the "git" and "image" PipelineResources made it possible to create a workflow that cloned and built a repo, and pushed the resulting image to an image repository, all in one Task. However, PipelineResources still required forms of storage external to pods, like PVCs. In addition, PipelineResources hurt reusability because they required Task authors to anticipate what other functionality would be needed before and after the Task. For this reason, among others, they were deprecated; please see TEP-0074 for more information.

Rely on the Affinity Assistant for improved TaskRun scheduling

The Affinity Assistant schedules TaskRuns that share a PVC on the same node. This feature allows TaskRuns that share PVCs to run in parallel in a system that supports only ReadWriteOnce Persistent Volumes. However, this does not address the underlying issues of pod overhead and the need to shuttle data between TaskRuns in different pods. It also comes with its own set of drawbacks, which are described in more detail in TEP-0046: Colocation of Tasks and Workspaces.
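For reference, the Affinity Assistant is toggled cluster-wide via the feature-flags ConfigMap; the sketch below shows the relevant entry with its default value (enabled):

apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
  namespace: tekton-pipelines
data:
  disable-affinity-assistant: "false" # set to "true" to opt out of the Affinity Assistant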

Use Task results to share data without using PVCs

Tasks may emit string results that can be used as parameters of subsequent Tasks. There is an existing TEP for supporting dictionary and array results as well. However, results are not designed to handle large, arbitrary forms of data like source repositories or images. While there is some ongoing discussion around supporting large results, result data would still need to be stored externally to pods.
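For example, a Task can declare a small string result that a later Pipeline Task consumes as a parameter; a minimal sketch (Task and result names are illustrative):

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: get-commit
spec:
  results:
  - name: commit
    description: the resolved commit SHA
  steps:
  - name: write-result
    image: alpine
    script: |
      # results are written to a well-known path and are intended to stay small
      printf "abc123" > $(results.commit.path)

A subsequent Pipeline Task can then reference the value as $(tasks.get-commit.results.commit) in its params, but there is no equivalent mechanism for passing a cloned repository or a built image.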

Goals

  • Make it possible to combine Tasks so that multiple Tasks can run together, with control over the pods and volumes required.
  • Provide a mechanism to colocate Tasks that execute some "core logic" (e.g. a build) with Tasks that fetch inputs (e.g. git clone) or push outputs (e.g. docker push).

Non-Goals

  • Updating the Task CRD to allow Tasks to reference other Tasks at Task authoring time. We could decide to include this if we have some use cases that need it; for now avoiding this allows us to avoid many layers of nesting (i.e. Task1 uses Task2 uses Task3, etc.) or even worse, recursion (Task 1 uses Task 2 uses Task 1...)
  • Replacing all functionality that was provided by PipelineResources. See TEP-0074 for the deprecation plan for PipelineResources.
  • Building functionality into Tekton to determine which Tasks should be combined together, as opposed to letting a user configure this. We can explore providing this functionality in a later iteration of this proposal.

Use Cases

  • A user wants to use catalog Tasks to check out code, run unit tests, and upload outputs, and does not want to incur the additional overhead (and performance impact) of creating volume-based workspaces to share data between them in a Pipeline.
  • An organization does not want to use PVCs at all; for example, it may have standardized on uploading to and downloading from cloud storage buckets (e.g. GCS). This could be accomplished by colocating a cloud storage upload Task with the Task responsible for other functionality.
  • An organization is willing to use PVCs to some extent but needs to put limits on their use.
  • A user has decided that the overhead of spinning up multiple pods is too high and wants more control over how many pods are used.

Requirements

  1. Tasks can be composed and run together:
  • Must be able to share data without requiring a volume external to the pod
  • Must be possible to run multiple Tasks as one pod
  2. It should be possible to have Tasks that run even if others fail; i.e. the Task can be run on the same pod as another Task that fails
  • This is to support use cases such as uploading test outputs, even if the test Task failed
  • This requirement is being included because we could choose a solution that doesn't address the above use case.
  3. The chosen solution should allow us to provide both authoring time and runtime configuration to colocate multiple Tasks.
  • We will start with only configuration for one of (authoring time, runtime) and optionally provide the other based on user feedback.
  4. The status of each TaskRun should be displayed separately to the user, with one TaskRun per Task.

Design Considerations

Almost every proposed solution involves running multiple Tasks in one pod, and some involve running an entire Pipeline in a pod. This section details pod constraints that will need to be addressed by the chosen design.

Some constraints apply to any solution that runs multiple Tasks in one pod. For example, because a pod has a single ServiceAccount, every Task run in that pod must use the same ServiceAccount. Other constraints are relevant only to Pipeline-level features: for example, users can reference PipelineRun context in TaskRun parameters, and supporting this feature in a pod might require entrypoint changes. Some functionality blurs the line between Pipeline-level features and the features of a plain group of Tasks (i.e. a Pipeline's "tasks" field without its "finally" field), such as the ability to run Tasks in parallel and pass inputs and outputs between them.
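For instance, a Pipeline Task can pass PipelineRun context to a Task through a parameter; any pod-based execution mode needs these values resolved before the Task's containers start. A minimal sketch using the existing context variable syntax (the Task name is illustrative):

tasks:
- name: report
  taskRef:
    name: send-report # assumed Task that accepts a "run-name" param
  params:
  - name: run-name
    value: $(context.pipelineRun.name) # resolved by the PipelineRun controller today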

The following list details how pod constraints affect Pipeline features. Deciding what design to implement will require deciding which of these features are in and out of scope and what level of abstraction is most appropriate for handling them.

Pipeline functionality supported in pods

Currently, the TaskRun controller creates one pod per TaskRun, with one container per Step. The pod's containers all start in parallel, but the entrypoint binary run in each container enforces sequential Step execution: each Step writes its metadata to a shared pod volume when it completes, and each container waits for the previous Step's outputs to appear there before it begins executing.

Some functionality required to run multiple Tasks in a pod could be supported with existing pod construction and entrypoint code; some functionality would require changes to this code, and some functionality may not be possible at all.

(See also functionality supported by experimental Pipeline to TaskRun)

Dynamically created TaskRuns in Pipelines

Currently, the number of TaskRuns created from a Pipeline is determined at authoring time based on the number of items in a Pipeline's Tasks field. However, TEP-0090: Matrix proposes a feature that would allow additional TaskRuns to be created when a PipelineRun is executed. In summary, a Pipeline may need to run a Task multiple times with different parameters. The parameters "fanned out" to a Task run multiple times in parallel may be specified at authoring time, or they may come from an earlier Task. For example, a Pipeline might clone a repo, read a set of parameters from the repo, and run a build or test task once for each of these parameters.

A controller responsible for running multiple Tasks in one pod must know how many Tasks will be run before creating the pod. This is because a pod will start executing once it has been created, and many of the fields (including the containers list) cannot be updated. However, the number of Tasks needed may not be known until the previous Task is run and its outputs are retrieved. Therefore, we may be able to support running a matrixed Pipeline in a pod only when the full set of parameters is known at the start of execution. We may not be able to support dynamic matrix parameters or other forms of dynamic Task creation.
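For reference, TEP-0090 proposes fan-out syntax roughly along these lines (see that TEP for the authoritative shape; the Task name is illustrative). When the values are literals, as below, the number of TaskRuns is known before the pod is created; when they come from an earlier Task's results, it is not:

tasks:
- name: build
  taskRef:
    name: build-image # assumed Task parameterized by platform
  matrix: # one TaskRun is fanned out per value
  - name: platform
    value:
    - linux/amd64
    - linux/arm64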

Hermekton support

TEP-0025 proposes specifying "hermeticity" levels for Tasks or Steps. Hermetic Tasks and Steps cannot communicate with non-hermetic Tasks and Steps during execution, meaning that all inputs will be specified prior to Task/Step start. This requires isolating the network and filesystem for hermetic Tasks/Steps.

Use Cases

TEP-0025 describes some use cases for specifying different levels of hermeticity for Tasks in a Pipeline. It's not yet clear whether users would like the ability to specify different levels of hermeticity for Tasks that are part of a PipelineRun executed in a pod. Security-minded users might prefer to run their Tasks with different service accounts, which is not possible in one pod.

Feasibility

A Task can only be considered hermetic if it is unable to communicate with other Tasks during execution, and likewise for Steps. If we wanted to support different levels of hermeticity for Tasks run in the same pod, we would need to provide a way for the Steps in the hermetic Task to communicate with other Steps in that Task, but not with Steps in other Tasks.

Containers in a pod can communicate either via ports or via the filesystem.

Isolating the filesystem of a Task run in the same pod as other Tasks is likely feasible. A pod can provide shared volumes for containers to use, meaning that we can control how containers communicate via the filesystem by controlling which of the pod's volumes they have access to.
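As an illustration of why this is tractable, which containers can see which data is already determined entirely by volume mounts; in the plain Kubernetes sketch below, only the first container can access the shared volume:

apiVersion: v1
kind: Pod
metadata:
  name: filesystem-isolation-sketch
spec:
  volumes:
  - name: task-a-data
    emptyDir: {}
  containers:
  - name: task-a-step
    image: alpine
    command: ["sleep", "3600"]
    volumeMounts:
    - name: task-a-data # only Task A's Steps would mount this volume
      mountPath: /workspace
  - name: task-b-step
    image: alpine
    command: ["sleep", "3600"]
    # no volumeMounts: this container cannot read or write /workspace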

Isolating the network of a Task run in the same pod as other Tasks is more challenging. Containers in a pod share a network namespace, and can communicate with each other via localhost. This means that it's straightforward to restrict a single container's access to the network within the pod by preventing it from communicating via localhost, as is proposed for Step-level hermeticity. However, it's much more challenging to allow a group of containers to communicate with each other, but not with other containers in a pod, a feature that would be necessary to run hermetic and non-hermetic Tasks in the same pod. We could explore using EBPF to control the container network, but this is likely a large amount of effort and would not work on all platforms.

We could work around this limitation via a few options:

  1. Requiring that hermetic Tasks have only 1 step if they are run in a pod with other Tasks.
  2. Not allowing Steps within hermetic Tasks to communicate with each other.
  3. Requiring that hermetic Tasks not execute in parallel with other Tasks run in the same pod.

Controller role in scheduling TaskRuns

Some solutions to this problem involve allowing a user to configure which TaskRuns they would like to be executed on one pod, and some solutions allow the controller to determine which TaskRuns should be executed on one pod.

For example, if we decide to create a TaskGroup abstraction, we could decide that all Tasks in a TaskGroup should be executed on the same pod, or that the controller gets to decide how to schedule TaskRuns in a TaskGroup. Similarly, we could provide an option to execute a Pipeline in a pod, or an option to allow the PipelineRun controller to determine which TaskRuns should be grouped.

We should first tackle the complexity of running multiple TaskRuns on one pod before tackling the complexity of determining which TaskRuns should be scheduled together. A first iteration of this proposal should require the user to specify when they would like TaskRuns to be combined together. After experimentation and user feedback, we can explore providing an option that would rely on the controller to make this decision.

Pipelines that build images

A frequent Tekton use case is a Pipeline that builds an image, and then runs that image in a subsequent Task. Running such a Pipeline in a single pod would require Tekton to update the image used in the downstream Tasks after a build Task has completed. While container images may be updated after pod creation, we must consider how Tekton would know whether to replace an image, and how the new image could be re-wrapped with the Tekton entrypoint before the Task begins to execute.

Additional Design Considerations

  • Executing an entire Pipeline in a pod, as compared to executing multiple Tasks in a pod, may pave the way for supporting local execution.

Design proposal

Summary

The Tekton Pipelines controller will be updated to support running a Pipeline in a pod. Users will choose which parts of a pipeline to run in a pod by grouping them into a sub-Pipeline, using the Pipelines in Pipelines feature. Any Pipelines grouped under a Pipeline executed in a pod will also execute in that pod.

We will create a TaskRun per Pipeline Task (or multiple TaskRuns, for matrixed Pipeline Tasks) and use it as the source of truth for the status of the execution of that Task.

Additional design details and naming still under discussion.
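As a rough illustration of the direction (naming and mechanism are not settled; the annotation below is purely hypothetical), a user might group the Tasks they want colocated into a sub-Pipeline and mark that sub-Pipeline for execution in a pod:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: clone-and-test
  annotations:
    tekton.dev/run-in-pod: "true" # hypothetical marker; the real API is still under discussion
spec:
  tasks:
  - name: get-source
    taskRef:
      name: git-clone
  - name: run-unit-tests
    runAfter:
    - get-source
    taskRef:
      name: just-unit-tests

A parent Pipeline would then reference this sub-Pipeline via the Pipelines in Pipelines feature, and all of its Tasks (including any nested sub-Pipelines) would execute in a single pod.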

Rollout plan

Required Features

  • Sequential Tasks. This TEP aims to mitigate the challenges of passing data between subsequent Tasks, such as a git clone followed by a docker build, so we must support sequential Tasks in one pod.
  • Workspaces and Results. These features are also critical for passing data between Tasks.
  • Finally Tasks. Finally Tasks should be supported for use cases which involve uploading or processing PipelineRun outputs, such as data from previous Tasks contained in Workspaces.
  • Parallel Tasks. Another primary use case for this TEP is to support parallel Tasks that share data, which today typically must be run in separate pods on the same node. Running parallel Tasks on the same pod wouldn't allow computation to be balanced between nodes, but it would allow users to avoid paying performance and UX penalties associated with PVCs or other workspace backing options. In addition, finally Tasks, which are also a required feature, run in parallel.
  • Params. Params are critical to the reusability of many catalog Tasks, and implementation in a pod will require few changes.

Future Work

The following features are not critical to address data locality, but they are useful for PipelineRuns executed in single pods for the same reasons that they are useful in PipelineRuns executed in multiple pods. Not all of these features will be equally easy to implement in a pod, so we should collect user feedback on which items are the most critical to support in a pod and tackle the highest-value and lowest-effort features before moving on to others on this list.

  • when expressions
  • retries
  • timeouts
  • task-level sidecars
  • Matrixed Tasks (fanned out from params)

Of these features, matrixed tasks and pipeline-level timeouts are likely the highest-priority items to support.

Use cases for the following features are likely impacted by the execution mode:

  • hermetic fences between Tasks: see Hermekton support
  • Custom Tasks: Users frequently use Custom Tasks for actions that don’t make sense to run in a pod, so the logic of a Custom Task wouldn’t make sense to run in a pod with other Tasks. However, a Pipeline run in a pod could still contain Custom Tasks whose logic is run elsewhere.

The following features don’t yet exist, but we could choose to add them for Pipeline in a pod:

  • Pipeline- or Task-level resource requirements: Users may be more likely to care about the resource usage of the pod rather than individual containers, since Kubernetes uses the pod’s resource requirements to make decisions on scheduling and eviction.
  • Pipeline-level sidecars: This feature is requested in issue 4235.

Unsupported Features

The following features are likely infeasible, regardless of how useful they may be.

Milestones

  • Milestone 1: Support the bare minimum functionality required to run multiple sequential Tasks in one pod, including Tasks defined inline or as TaskRefs.
  • Milestone 2: Support params, results, and workspaces.
  • Milestone 3: Support parallel Tasks.
  • Milestone 4: Support finally Tasks.
  • Milestone 5: Support matrixed Tasks fanned out from parameters and pipeline-level timeouts.

At this point, we will have implemented most of the data locality functionality lost by deprecating PipelineResources. We should collect user feedback on the feature, and if needed, move on to support other features listed in future work.

Drawbacks

  • We will not be able to support all of Pipelines' functionality in a pod. This could be confusing for users, especially if Pipelines can be configured to run in a pod at runtime but not all authoring time features are supported in a pod.

Alternatives

Pipeline executed as a TaskRun

This is the approach currently taken in the Pipeline to TaskRun experimental custom task. In this approach, a CustomTask turns a PipelineRun into a TaskRun, and the TaskRun controller executes this TaskRun in a pod.

Pros:

  • Permits reuse of existing CRDs with less re-architecting than having the PipelineRun controller run a PipelineRun in a pod.

Cons:

  • We would be limited in the features we could support to features that TaskRuns already support, or we'd have to add more Pipeline features to TaskRuns.
  • Does not surface outputs of Pipeline Tasks separately. Breaks the 1-1 relationship between Tasks and TaskRuns.

Allow Pipeline Tasks to contain other Tasks

In this option, Pipeline Tasks can refer to other Tasks, which are resolved at runtime and run sequentially in one pod.

This addresses the common use case of a Task needing to be colocated with some inputs and outputs, but may not generalize well to more complex Pipelines. One example of a Pipeline that would not be able to run its Tasks in a pod using this strategy is a Pipeline with a "fan out" and then "fan in" structure. For example, a Pipeline could clone a repo in its first Task, use that data in several subsequent Tasks in parallel, and then have a single Task responsible for cleanup or publishing outputs.

In the following example, 3 Tasks will be combined and run in one pod sequentially:

  1. git-clone
  2. just-unit-test
  3. gcs-upload
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: build-test-deploy
spec:
  params:
  - name: url
    value: https://github.com/tektoncd/pipeline.git
  - name: revision
    value: v0.11.3
  workspaces:
  - name: source-code
  - name: test-results
  tasks:
  - name: run-unit-tests
    taskRef:
      name: just-unit-tests
    workspaces:
    - name: source-code
    - name: test-results
    init/before:
    - taskRef: git-clone
      params:
      - name: url
        value: $(params.url)
      - name: revision
        value: $(params.revision)
      workspaces:
      - name: source-code
        workspace: source-code
    finally/after:
    - taskRef: gcs-upload
      params:
      - name: location
        value: gs://my-test-results-bucket/testrun-$(taskRun.name)
      workspaces:
      - name: data
        workspace: test-results

The finally/after Task(s) would run even if the previous steps fail.

Pros:

  • Is an optional addition to the existing types (doesn't require massive re-architecting)
  • We have some initial indication (via PipelineResources) that this should be possible to do
  • Maintains a line between when to use a complex DAG and when to use this functionality since this is only sequential (but the line is fuzzy)

Cons:

  • Developing a runtime syntax for this functionality will be challenging
  • Only helps us with some scheduling problems (e.g. doesn't help with parallel tasks or finally task execution)
  • What if you don't want the last Tasks to run if the previous tasks fail?
    • Not clear how we would support more sophisticated use cases, e.g. if folks wanted to start mixing when expressions into the before/init and/or finally/after Tasks
  • If you want some other Task to run after these, you'll still need a workspace/volume + separate pod
  • What if you want more flexibility than just before and after? (e.g. you want to completely control the ordering)
    • Should still be possible, can put as many Tasks as you want into before and after

Related:

Automagically combine Tasks that share the same workspaces

In this option we could leave Pipelines as they are, but at runtime instead of mapping a Task to a pod, we could decide what belongs in what pod based on workspace usage.

In the example below, get-source, run-unit-tests, and upload-results each use at least one of the two workspaces, so they will be executed as one pod, while update-slack will run as a separate pod:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: build-test-deploy
spec:
  params:
  - name: url
    value: https://github.com/tektoncd/pipeline.git
  - name: revision
    value: v0.11.3
  workspaces:
  - name: source-code
  - name: test-results
  tasks:
  - name: get-source
    workspaces:
    - name: source-code
      workspace: source-code
    taskRef:
      name: git-clone
    params:
    - name: url
      value: $(params.url)
    - name: revision
      value: $(params.revision)
  - name: run-unit-tests
    runAfter:
    - get-source
    taskRef:
      name: just-unit-tests
    workspaces:
    - name: source-code
      workspace: source-code
    - name: test-results
      workspace: test-results
  - name: upload-results
    runAfter:
    - run-unit-tests
    taskRef:
      name: gcs-upload
    params:
    - name: location
      value: gs://my-test-results-bucket/testrun-$(taskRun.name)
    workspaces:
    - name: data
      workspace: test-results
  finally:
  - name: update-slack
    params:
    - name: message
      value: "Tests completed with $(tasks.run-unit-tests.status) status"

Possible tweaks:

Pros:

  • Doesn't require any changes for Pipeline or Task authors
  • Allows execution related concerns to be determined at runtime

Cons:

  • Will need to update our entrypoint logic to allow for containers running in parallel
  • Doesn't give as much flexibility as being explicit
    • This functionality might not even be desirable for folks who want to make use of multiple nodes
      • We could mitigate this by adding more configuration, e.g. opt in or out at a Pipeline level, but could get complicated if people want more control (e.g. opting in for one workspace but not another)
  • Removes ability to run Tasks on separate pods if data is shared between them.

Add "grouping" to Tasks in a Pipeline or PipelineRun

In this option we add some notion of "groups" into a Pipeline; any Tasks in a group will be scheduled together. Consider the following Pipeline definition:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: build-test-deploy
spec:
  params:
  - name: url
    value: https://github.com/tektoncd/pipeline.git
  - name: revision
    value: v0.11.3
  workspaces:
  - name: source-code
  - name: test-results
  tasks:
  - name: get-source
    workspaces:
    - name: source-code
      workspace: source-code
    taskRef:
      name: git-clone
    params:
    - name: url
      value: $(params.url)
    - name: revision
      value: $(params.revision)
  - name: run-unit-tests
    runAfter:
    - get-source
    taskRef:
      name: just-unit-tests
    workspaces:
    - name: source-code
      workspace: source-code
    - name: test-results
      workspace: test-results
  - name: upload-results
    runAfter:
    - run-unit-tests
    taskRef:
      name: gcs-upload
    params:
    - name: location
      value: gs://my-test-results-bucket/testrun-$(taskRun.name)
    workspaces:
    - name: data
      workspace: test-results
  finally:
  - name: update-slack
    params:
    - name: message
      value: "Tests completed with $(tasks.run-unit-tests.status) status"

The following "group" definition could be specified in either the Pipeline or the PipelineRun:

groups:
- [get-source, run-unit-tests, upload-results]

This "grouping" would result in the Tasks get-source, run-unit-tests, and upload-results being run in the same pod.

Alternatively, Tasks could be grouped using labels.
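A label-based variant might look roughly like the following (purely illustrative; no such field exists today):

tasks:
- name: get-source
  taskRef:
    name: git-clone
  labels: # hypothetical field on a Pipeline Task
    tekton.dev/pod-group: tests
- name: run-unit-tests
  runAfter:
  - get-source
  taskRef:
    name: just-unit-tests
  labels:
    tekton.dev/pod-group: tests # same group value, so scheduled into the same pod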

Pros:

  • Minimal changes for Pipeline authors
  • Allows Pipeline to run multiple Tasks in one pod without having to support all of a Pipeline's functionality in a pod

Cons:

Combine Tasks based on runtime values of Workspaces

In this solution we use the values provided at runtime for workspaces to determine what to run. Specifically, we allow emptyDir to be provided as a workspace at the Pipeline level even when that workspace is used by multiple Tasks, and when that happens, we take that as the cue to schedule those Tasks together.

For example given this Pipeline:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: build-test-deploy
spec:
  params:
  - name: url
  - name: revision
  workspaces:
  - name: source-code
  - name: test-results
  tasks:
  - name: get-source
    workspaces:
    - name: source-code
      workspace: source-code
    taskRef:
      name: git-clone
    params:
    - name: url
      value: $(params.url)
    - name: revision
      value: $(params.revision)
  - name: run-unit-tests
    runAfter:
    - get-source
    taskRef:
      name: just-unit-tests
    workspaces:
    - name: source-code
      workspace: source-code
    - name: test-results
      workspace: test-results
  - name: upload-results
    runAfter:
    - run-unit-tests
    taskRef:
      name: gcs-upload
    params:
    - name: location
      value: gs://my-test-results-bucket/testrun-$(taskRun.name)
    workspaces:
    - name: data
      workspace: test-results

Running with this PipelineRun would cause get-source and run-unit-tests to be run in one pod, with upload-results in another:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: run
spec:
  pipelineRef:
    name: build-test-deploy
  workspaces:
  - name: source-code
    emptyDir: {}
  - name: test-results
    persistentVolumeClaim:
      claimName: mypvc

Running with this PipelineRun would cause all of the Tasks to be run in one pod:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: run
spec:
  pipelineRef:
    name: build-test-deploy
  workspaces:
  - name: source-code
    emptyDir: {}
  - name: test-results
    emptyDir: {}

Running with this PipelineRun would cause all of the Tasks to be run in separate pods:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: run
spec:
  pipelineRef:
    name: build-test-deploy
  workspaces:
  - name: source-code
    persistentVolumeClaim:
      claimName: otherpvc
  - name: test-results
    persistentVolumeClaim:
      claimName: mypvc

Pros:

  • Allows the user to configure scheduling decisions at runtime without changing the Pipeline

Cons:

  • If it's important for a Pipeline to be executed in a certain way, that information will have to be encoded somewhere other than the Pipeline
  • For very large Pipelines, this default behavior may cause problems (e.g. if the Pipeline is too large to be scheduled into one pod)
  • Compared to the "task group" solution, this solution provides similar functionality but lends itself less well to adding authoring time configuration later.

Controller option to execute Pipelines in a pod

In this option, the Tekton controller can be configured to always execute Pipelines inside one pod. This would require functionality similar to running a Pipeline in a pod, but would provide less flexibility to Task and Pipeline authors, as only cluster administrators would be able to control scheduling.

TaskRun controller allows Tasks to contain other Tasks

This solution is slightly different from the "Allow Pipeline Tasks to contain other Tasks" solution, as this option would be implemented on the TaskRun controller rather than the PipelineRun controller. It would permit creating a graph or sequence of Tasks that are all run in the same pod, while maintaining Task reusability. However, it blurs the line between responsibility of a Task and responsibility of a Pipeline. It would likely lead to us re-implementing Pipeline functionality within Tasks, such as finally Tasks and when expressions.

Remove distinction between Tasks and Pipelines

In this version, we combine Tasks and Pipelines into a single abstraction: for example, by getting rid of Pipelines and adding all of their features to Tasks. The new abstraction would be able to run in a pod.

Things Tasks can do that Pipelines can't:

  • Sidecars
  • Refer to images (including args to images like script, command, args, env....)

Things Pipelines can do that Tasks can't:

  • Create DAGs, including running in parallel
  • Finally
  • When expressions

For example, say our new thing is called a Process:

kind: Process
metadata:
  name: git-clone
spec:
  workspaces:
  - name: source-code
  processes:
  - name: get-source
    steps: # or maybe each Process can only have 1 step and we need to use runAfter / dependencies to indicate ordering?
    - name: clone
      image: gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/git-init:v0.21.0
      script: <script here>
    workspaces:
    - name: source-code
      workspace: source-code
  finally:
  # since we merged these concepts, any Process can have a finally
---
kind: Process
metadata:
  name: build-test-deploy
spec:
  workspaces:
  - name: source-code
  - name: test-results
  processes:
  - name: get-source
    workspaces:
    - name: source-code
      workspace: source-code
    processRef: # processes could have steps or processRefs maybe?
      name: git-clone # uses our Process above
  - name: run-unit-tests
    runAfter:
    - get-source
    steps:
    - name: unit-test
      image: docker.io/library/golang:$(params.version)
      script: <script here>
    workspaces:
    - name: source-code
      workspace: source-code
    - name: test-results
      workspace: test-results
  - name: upload-results
    runAfter:
    - run-unit-tests
    processRef:
      name: gcs-upload
    params:
    - name: location
      value: gs://my-test-results-bucket/testrun-$(taskRun.name)
    workspaces:
    - name: data
      workspace: test-results
  finally:
  - name: update-slack
    params:
    - name: message
      value: "Tests completed with $(tasks.run-unit-tests.status) status"

We will need to add something to indicate how to schedule Processes now that we won't have the convenience of drawing the line around Tasks; we could combine this idea with one of the others in this proposal.

Pros:

  • Maybe the distinction between Tasks and Pipelines has just been making things harder for us
  • Maybe this is a natural progression of the "embedding" we already allow in pipelines?
  • We could experiment with this completely independently of changing our existing CRDs (as long as we don't want Processes to be called Task or Pipeline XD - even then we could use a different API group)
  • Might help with use cases where someone wants parallel Steps within a Task, e.g. this comment

Cons:

  • Pretty dramatic API change. Requires users to update their setups to accommodate a whole new abstraction.
  • This requires implementing Pipeline in a pod functionality. There's no reason to add more complexity on top of Pipeline in a pod when that solution would address the issues detailed above.

Create a TaskGroup abstraction

In this approach we create a new Tekton type called a "TaskGroup", which can be implemented as a new CRD or a Custom Task. TaskGroups may be embedded in Pipelines. We could create a new TaskGroup controller or use the existing TaskRun controller to schedule a TaskGroup.

The controller would be responsible for creating one TaskRun per Task in the TaskGroup, and scheduling each of these TaskRuns in the same pod. The controller would be responsible for reconciling both the TaskGroup and the TaskRuns created from the TaskGroup.
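A TaskGroup might look roughly like the following (a hypothetical sketch only; neither the CRD nor its fields exist):

apiVersion: tekton.dev/v1alpha1 # hypothetical API version for the new type
kind: TaskGroup
metadata:
  name: build-and-test
spec:
  tasks:
  - name: get-source
    taskRef:
      name: git-clone
  - name: run-unit-tests
    runAfter:
    - get-source
    taskRef:
      name: just-unit-tests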

The controller would need to determine how many TaskRuns are needed when the TaskGroup is first reconciled, due to limitations associated with dynamically creating Tasks. When the TaskGroup is first reconciled, it would create all TaskRuns needed, with those that are not ready to execute marked as "pending", and a pod with one container per TaskRun. The TaskGroup would store references to any TaskRuns created, and Task statuses would be stored on the TaskRuns.

In a future version of this solution, we could explore allowing the TaskGroup/TaskRun controller to determine how to schedule TaskRuns. For example, it could create a pod and schedule all the TaskRuns on it, or, if a single pod running all the Tasks is too large to be scheduled, it could split the TaskRuns between multiple pods. We could introduce configuration options to specify whether the controller should attempt to split up TaskRuns or simply fail if a single pod wouldn't be schedulable.

Pros:

  • Creating a single TaskRun for each Task would allow individual Task statuses to be surfaced separately.
  • Allows us to choose which Pipeline features to support, and marks a clear distinction for users between supported and unsupported features.
  • Having the Pipeline controller create TaskRuns up front (as "pending" or similar) might have other benefits, for example we've struggled in the past with how to represent the status of Tasks in a Pipeline which don't have a backing TaskRun, e.g. they are skipped or cancelled. Now there actually would be a TaskRun backing them.

Cons:

  • Unclear benefit compared to adding a grouping syntax within a Pipeline and letting the PipelineRun controller handle scheduling
  • We would likely end up supporting features like finally for both Pipelines and TaskGroups (and generally reusing a lot of the PipelineRun controller's code in the TaskGroup controller)
  • Must create all TaskRuns in advance
  • New CRD to contend with
  • Extra complexity for Task/Pipeline authors
  • Grouping decision can only be made at authoring time
  • Does not follow the Tekton reusability design principle "existing features should be reused when possible instead of adding new ones".

Support other ways to share data (e.g. buckets)

In this approach we could add more workspace types that support other ways of sharing data between pods, for example uploading to and downloading from s3. This doesn't address the problems of pod overhead and having to use data storage external to a pod to share data between Tasks. However, we may choose to proceed with this work independently of this TEP.

Task Pre and Post Steps

This strategy is proposed separately in TEP-0080. In summary, TEP-0080 proposes allowing TaskRuns to have "pre" steps responsible for downloading input data and "post" steps responsible for uploading outputs. The "main" steps would be able to run hermetically, while the pre and post steps would have network access.

Pros:

  • Meets requirements that multiple pieces of functionality can be run in one pod with different hermeticity options and no external data storage.

Cons:

  • Uses Step as a re-usable unit rather than Task. Tasks become less reusable, as they must anticipate what external data storage systems will be used on either end. This was one of the reasons PipelineResources were deprecated.
  • Less flexible than running multiple Tasks in one pod, as functionality must fit the model of "before steps", "during steps", and "after steps". Might not map neatly to more complex combinations of functionality, such as a DAG.

References