Problem
Users should be able to specify resource quotas for a given Pipeline, i.e. the sum of resources consumed by the concurrently running pods of the associated PipelineRun should not exceed the quota set for that particular Pipeline.
Right now, resource limits can only be set for Steps and namespaces, but users need a way to specify limits on the total amount of resources consumed by a Pipeline.
When we talk about quota (and resources/limits) there are several different things to handle:
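Number of objects (PipelineRun, TaskRun, Pipeline, …) - Object Count Quota
CPU, memory, storage used per object (and pod in the end) - Compute Resource Quota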
Here we are only looking at the compute resource quota.
Example problem:
Let us take an example of running a Maven build in a Pipeline - https://developers.redhat.com/blog/2020/02/26/speed-up-maven-builds-in-tekton-pipelines#build_a_pipeline_with_a_workspace
Now, the resources used by this Pipeline will depend entirely on the Java application being built. Tasks in the Pipeline will take different amounts of resources based on the input application, so it is not possible to know the requirements up front and hence not possible to allocate resources at a per-Step level. And as an admin, you cannot allow unspecified limits for pods on your cluster.
Limitations:
This is not a straightforward or generic problem, since it is largely use-case dependent. One could ask the following questions:
Why can we not configure this in Kubernetes?
Kubernetes does not allow applying a resource quota to an arbitrary subset of pods grouped by annotations or labels, or to pods owned by a custom resource (say, a TaskRun or a PipelineRun). It does, however, allow setting an object count quota for custom resources (https://kubernetes.io/docs/concepts/policy/resource-quotas/#object-count-quota), but not a quota on CPU or memory, which is what we want.
Why can we not use LimitRange?
A LimitRange can put constraints on pods or individual containers in a given namespace - but the scope is the entire namespace; there is no way to configure resources for a subset of pods in a namespace. So, it cannot distinguish pods running for a Pipeline from the rest of the pods.
For example, a Maven build task in a Pipeline requires much more resources than the pod for the resulting application. App operators need a way to give enough quota to the Maven build for that application without giving too much quota to all pods in that namespace.
LimitRange applies to all pods and does not distinguish between Tekton and other pods. If a Pipeline runs resource-intensive builds, the customer has to increase the limit for all pods via the LimitRange in that namespace which is not acceptable for the customer.
Why can we not use a per-Task (per-Step) resource limit?
In order to only consume the bare minimum amount of resources needed to execute one Step at a time from the invoked Task, Tekton only requests the maximum values for CPU, memory, and ephemeral storage from within each Step. This is sufficient as Steps only execute one at a time in the Pod.
Note to self: This behavior will change for 0.28 but the problem stays the same (it just supports LimitRange a bit better)
A per-Task resource limit does not help either since Task resource usage is dependent on what the Task is being used for. A Maven task for two different applications would take vastly different amounts of resources and a Task author cannot pre-determine that.
Current solutions (rather, workarounds):
Today, users are presented with the following options in Tekton:
Resource requests and limits can be set on a per Step basis:
https://tekton.dev/docs/pipelines/tasks/#defining-steps
Resource limits can be set via LimitRange:
https://tekton.dev/docs/pipelines/pipelineruns/#specifying-limitrange-values
Follow guidelines around reducing resource consumption of Pipelines:
https://docs.openshift.com/container-platform/4.8/cicd/pipelines/reducing-pipelines-resource-consumption.html
Also broadly, to restrict resource consumption in your project, a user can:
Set and manage resource quotas to limit the aggregate resource consumption - https://docs.openshift.com/container-platform/4.8/applications/quotas/quotas-setting-per-project.html
Use limit ranges to restrict resource consumption for specific objects, such as pods, images, image streams, and persistent volume claims. - https://docs.openshift.com/container-platform/4.8/nodes/clusters/nodes-cluster-limit-ranges.html
A user can work with these options and get some control over resource usage in their cluster: they can “guess” a resource quota per Step, set broader limits via LimitRange, and manage resource consumption to an extent. However, the broader problem of setting a “resource quota for a Pipeline” remains unsolved.
Besides the above-mentioned direct configuration options, users can also:
Keep Pipelines less generic
The more generic a pipeline is, the more difficult it is to predict its resource usage. This is more of a best practice: more specific pipelines are easier to apply resource quotas to.
Consider a Pipeline which builds a Maven application; this Pipeline could be used to build a heavy Spring Boot application or a simple Hello World Java application. Resource consumption in the two cases will differ widely and hence will be much harder to predict.
Now consider another Pipeline which is more specific to building Spring Boot applications via Maven. Resource quotas for this Pipeline will be much easier to predict and control.
Even though both pipelines essentially do the same thing, the specificity and intent help in managing resources much more efficiently.
Parameterisation
https://tekton.dev/docs/pipelines/pipelines/#specifying-parameters
https://tekton.dev/docs/pipelines/tasks/#specifying-parameters
Customers can create Task parameters for the Step resource limits in all of their Tasks and then pass parameters from the Pipeline to all Tasks in that Pipeline, so that resource limits can be controlled from the Pipeline.
There are a few issues that need to be resolved to make parameterisation work today:
#4080
#1530
#1548
Once those are fixed, parameterisation is a possible solution to the problem of specifying resource quota at the Pipeline level.
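To make the idea concrete, here is a hypothetical sketch of what that could look like. The Task name, image, and the use of $(params.memory-limit) inside a Step's resources field are assumptions; substituting parameters into resource fields is exactly what the issues above track, so this is not guaranteed to work on current releases:

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: maven-build   # hypothetical Task name
spec:
  params:
    - name: memory-limit
      type: string
      default: "2Gi"
  steps:
    - name: build
      image: maven:3-openjdk-11
      script: mvn package
      resources:
        limits:
          # Hypothetical: parameter substitution in resource fields is
          # what the linked issues are about.
          memory: $(params.memory-limit)
```

A Pipeline could then set memory-limit per task when invoking its Tasks, which is how the limits would be controlled from the Pipeline.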
Users will have to use external tooling until these issues are resolved. For example, they can use envsubst - https://www.gnu.org/software/gettext/manual/html_node/envsubst-Invocation.html - to replace their custom parameters before creating the resources in the cluster.
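A minimal sketch (the file name and the ${MEMORY_LIMIT} placeholder are assumptions; envsubst substitutes environment variables into the manifest before it reaches the cluster):

```sh
# task-template.yaml contains e.g. "memory: ${MEMORY_LIMIT}"
MEMORY_LIMIT=2Gi envsubst < task-template.yaml | kubectl create -f -
```

However, there are certain limitations to the entire parameterisation approach: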
The end goal is to specify resource quota for all the concurrent running pods in a pipeline. Say, an admin allocates 10Gi memory to a pipeline that has:
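5 tasks which execute in parallel: via parameterisation, each task could be allocated 2 Gi of memory so the resulting consumption never exceeds 10 Gi.
5 tasks that run sequentially: via parameterisation, each task could be allocated 10 Gi of memory, since only one runs at a time, so the resulting consumption never exceeds 10 Gi.
3 tasks that run sequentially and then 2 tasks that run in parallel: the sequential tasks can be allocated 10 Gi of memory each and the 2 parallel tasks can be allocated 5 Gi of memory each.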
However, these calculations need to be made beforehand by the end user and can get very complicated very quickly.
While this might be suitable for certain cases, in the broader view there are a few drawbacks:
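Parameterisation increases the verbosity of the Pipeline and Task definitions, and the resource definitions become difficult to read. This might be fixed to an extent by “Passing parameters and resources Pipeline -> Task” #1484.
Since the parameters are going to be different for every Pipeline, this reduces the shareability of a Pipeline.
The calculations required for parameterisation are tedious and will often become inaccurate after simple modifications to the resource definitions.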
Proposed solutions and brainstorming
ResourceQuota, PriorityClass and ResourceQuota per PriorityClass
What is a PriorityClass?
https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass
A PriorityClass defines a mapping from a priority class name to the integer value of the priority. The higher the value, the higher the priority.
After you have created a PriorityClass, you can create Pods that specify that PriorityClass name in their specifications.
The scheduler orders pending Pods by their priority, and a pending Pod is placed ahead of other pending Pods with lower priority in the scheduling queue. As a result, a higher-priority Pod may be scheduled sooner than Pods with lower priority if its scheduling requirements are met. If such a Pod cannot be scheduled, the scheduler will continue and try to schedule other, lower-priority Pods.
What is a ResourceQuota?
https://kubernetes.io/docs/concepts/policy/resource-quotas/
A ResourceQuota provides constraints that limit aggregate resource consumption per namespace. It can limit the quantity of objects that can be created in a namespace by type, as well as the total amount of compute resources that may be consumed by resources in that namespace.
You can limit the total sum of compute resources that can be requested in a given namespace via Compute Resource Quota.
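For illustration, a minimal namespace-wide compute quota could look like this (the name and values are arbitrary):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```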
… and???
Great, so we can prioritize pod scheduling via PriorityClass and we can manage (compute) resource quota in a namespace via ResourceQuota, but how does this fix specifying resource quotas for a Pipeline?
What is the underlying mechanism we need in order to specify a resource quota for a Pipeline?
We want to set resource quota for the entire subset of pods which are created by a PipelineRun; and while Kubernetes does not allow this (yet), we can achieve pretty much the same via ResourceQuota per PriorityClass.
Resource Quota Per PriorityClass -
https://kubernetes.io/docs/concepts/policy/resource-quotas/#resource-quota-per-priorityclass
Pods can be created at a specific priority and you can control a pod's consumption of system resources based on a pod's priority.
While this is not exactly what we wanted, this means that we can set resource quota for a subset of pods and those pods are “selected” via a given PriorityClass.
Let’s see an example:
Let’s create a PriorityClass for a given Pipeline:
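```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: pipeline1-pc
value: 1000000
description: "Priority class for pipeline1"
```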
Let’s create a ResourceQuota for that Pipeline:
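A sketch, using a scopeSelector to scope the quota to the pipeline1-pc PriorityClass; the hard limits are chosen to match the describe output shown below:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pipeline1-rq
spec:
  hard:
    cpu: "1k"
    memory: 200Gi
    pods: "10"
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["pipeline1-pc"]
```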
Let’s check the ResourceQuota for that Pipeline:
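Immediately after creation, the output should look roughly like this:

```
$ kubectl describe quota
Name:       pipeline1-rq
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     1k
memory      0     200Gi
pods        0     10
```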
Since no pods have been created, nothing has been used from the quota.
Now, create the Pipeline and Tasks:
Make sure all the tasks in the given Pipeline belong to the same PriorityClass.
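One way to do this, as a sketch (the Pipeline and PipelineRun names are placeholders), is to set the PriorityClass once in the PipelineRun's pod template so that every pod created by the run inherits it:

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipeline1-run
spec:
  pipelineRef:
    name: pipeline1
  podTemplate:
    # Every TaskRun pod created for this run gets this PriorityClass,
    # so all of them are counted against pipeline1-rq.
    priorityClassName: pipeline1-pc
```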
Now, you can run the Pipeline:
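For example, by applying the PipelineRun above (the file name is assumed):

```sh
kubectl create -f pipeline1-run.yaml
```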
Let’s check the ResourceQuota for that Pipeline again:
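```
$ kubectl describe quota
Name:       pipeline1-rq
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         500m  1k
memory      10Gi  200Gi
pods        1     10
```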
This means that a combined resource quota for all the concurrently running pods belonging to a PriorityClass can be managed via ResourceQuota per PriorityClass.
Fixing this in upstream Kubernetes
While the above solution can work today, the use of PriorityClass is not ideal: PriorityClass is meant to assign scheduling priority to pods, which is not what we are using it for here.
In the long run, it would be worthwhile to evaluate and contribute this feature to upstream Kubernetes, so that a resource quota could be set on a subset of pods selected via annotations or labels, or on custom resources. Since object count quota is already available for custom resources, compute resource quota could also be a viable proposal upstream.
Fixing this in Tekton itself
There are a couple of ways in which setting a resource quota for a Pipeline could be supported in Tekton itself.
If we were to leverage LimitRange, a PipelineRun that requires a resource quota could be executed in an arbitrary namespace created at runtime, with the resource quota set as a LimitRange in that namespace. However, this seems like overkill and not a clean solution.
Tekton could have its own resource quota manager in the controller, where resource usage for a given Pipeline is tracked and pods are released for scheduling only when the resource quota conditions are met. While this might be worthwhile for advanced use cases, it could also become a slippery slope and a maintenance burden.
References, further reading and notes:
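You can set quota for the total number of certain resources of all standard, namespaced resource types; the same syntax can be used for custom resources. For example, to create a quota on a widgets custom resource in the example.com API group, use count/widgets.example.com - https://kubernetes.io/docs/concepts/policy/resource-quotas/#object-count-quota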