Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce runHistoryLimit #2332

Closed
jlpettersson opened this issue Apr 5, 2020 · 5 comments
Closed

Introduce runHistoryLimit #2332

jlpettersson opened this issue Apr 5, 2020 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@jlpettersson
Copy link
Member

Expected Behavior

Assume I have a namespace with a Pipeline and a Trigger with a connected git-repo. Each commit to my git-repo would typically send a webhook to the Trigger and a new PipelineRun is created, e.g. for CI-builds.

After some time, I will have a lot of PipelineRun resources in my namespace.

I expect that Tekton, to some extent can garbage collect older resources, for sustainable use without manual intervention. I expect to be able to configure a history limit, per team-namespace or Pipeline - some apps/teams may want a longer history and some want to keep it shorter e.g. for resource cost when using e.g. volumeClaimTemplate that allocates a PersistentVolume as long as the PipelineRun exists.

Actual Behavior

Resources created by the Trigger or other tools need to be deleted manually.

Additional Info

Kubernetes Deployment and StatefulSet has a field revisionHistoryLimit that reminds of this feature.

Deployment has a different resource structure, it creates ReplicaSets and "supervise" them, and also delete them. See https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/sync.go#L432-L438

// cleanupDeployment is responsible for cleaning up a deployment ie. retains all but the latest N old replica sets
// where N=d.Spec.RevisionHistoryLimit. Old replica sets are older versions of the podtemplate of a deployment kept
// around by default 1) for historical reasons and 2) for the ability to rollback a deployment.
func (dc *DeploymentController) cleanupDeployment(oldRSs []*apps.ReplicaSet, deployment *apps.Deployment) error {

Proposed Design

How can this be added to Tekton? It is the PipelineRun and TaskRun that is created from a TriggerTemplate that need to have a history limit, e.g. runHistoryLimit

What controller should be responsible for the limit? Should any part of the Trigger be responsible, or should the pipelinerun-reconciler and taskrun-reconciler be responsible?

Some of this has been discussed in tektoncd/triggers#64 but OwnerReference is probably not the full solution here.

@imjasonh
Copy link
Member

imjasonh commented Apr 5, 2020

One way to configure this would be in a configmap in the Tekton installation, that would configure it installation/cluster-wide.

Another option would be to add it as an option of the Trigger, and the triggerer would be responsible for deleting (and have permission🤔). This also only solves it for the triggering case, and ignores manual runs, or runs not triggered by "Triggers".

@imjasonh
Copy link
Member

imjasonh commented Apr 5, 2020

Also related: #454

@vdemeester
Copy link
Member

/kind feature

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 6, 2020
@ghost
Copy link

ghost commented Apr 6, 2020

We explored this idea a bit in #1486 and its related issue #1302. In that PR the idea was to use a TTL Controller to clean up resources.

It was then suggested that we could use a CronJob for this, and so we punted on implementing anything at the controller / reconciler level. I added an issue to the experimental repo to start trying this out, but I don't think anyone has put more focus into it since then.

@jlpettersson
Copy link
Member Author

I agree that a CronJob as in tektoncd/experimental#479 could solve this problem unless we have a better option. It is the Trigger that create new *Run now, a possible solution could be to also "garbage collect" there, but I don't see a good place for the limit-field there. So I am fine with a "garbage collect"-CronJob as a solution for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

4 participants