Adjust the way {pipeline,task}run metrics are surfaced. #4204

Merged
merged 1 commit into from
Sep 1, 2021


mattmoor
Member

@mattmoor mattmoor commented Aug 31, 2021

It's been bugging me that the `NewController` methods for the `TaskRun` and `PipelineRun` controllers launch a goroutine to harvest metrics, and it occurred to me that there might be a better way.

Borrowing from the CloudEvent client's use of hand-crafted injection logic: https://github.com/tektoncd/pipeline/blob/7297c48d26da98552be4ee3c50d94a130bd8e79d/pkg/reconciler/events/cloudevent/cloudeventclient.go#L29

This change takes a similar approach, creating `pkg/{task,pipeline}runmetrics` packages, which surface their `*Recorder` via `.Get(ctx)` methods. Rather than the controller spinning off a goroutine, this piggybacks on informer injection to "Start()" that process after the dependent informers have been started.

/kind cleanup

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Docs included if any changes are user facing
  • Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Release notes block below has been filled in or deleted (only if no user facing changes)

Release Notes

NONE

@tekton-robot tekton-robot added release-note-none Denotes a PR that doesnt merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Aug 31, 2021
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 31, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|------|--------------|--------------|-------|
| pkg/pipelinerunmetrics/injection.go | Do not exist | 18.2% | |
| pkg/pipelinerunmetrics/metrics.go | Do not exist | 75.4% | |
| pkg/reconciler/taskrun/controller.go | 95.8% | 95.0% | -0.8 |
| pkg/taskrunmetrics/injection.go | Do not exist | 18.2% | |
| pkg/taskrunmetrics/metrics.go | Do not exist | 81.7% | |

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

/retest

the alpha tests are very flaky 😞

Member

@vdemeester vdemeester left a comment

/cc @khrm as it affects #4201

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2021
@dlorenc
Contributor

dlorenc commented Sep 1, 2021

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 1, 2021
@mattmoor
Member Author

mattmoor commented Sep 1, 2021

Hmm, the race detector blew up, gonna take a quick look at where

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

    testing.go:1092: race detected during execution of test
FAIL
FAIL	github.com/tektoncd/pipeline/pkg/reconciler/taskrun	6.991s

I don't see the usual stack trace in the blended -v output, so I'm going to try and repro this locally with -race -count=10 🤞

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

Nothing at 10, going for 100 while I walk the dog :)

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

Well all I got was:

race: limit on 8128 simultaneously alive goroutines is exceeded, dying

Let's see if prow can repro it.

/retest
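
The message above reflects the race detector's cap on simultaneously alive goroutines. Nothing in this thread establishes what caused it here, but one generic way repeated runs can approach that cap is background goroutines that outlive the test that started them. As a hedged illustration only, the sketch below ties a sampling loop to a stop channel and waits for it to exit before the next iteration. `startHarvest` and `runOnce` are hypothetical names, not functions from this repository.

```go
package main

import (
	"fmt"
	"time"
)

// startHarvest launches a background loop and returns a channel that is
// closed once the loop has exited. Both the name and signature are hypothetical.
func startHarvest(stopCh <-chan struct{}, interval time.Duration, report func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		t := time.NewTicker(interval)
		defer t.Stop()
		for {
			select {
			case <-stopCh:
				return
			case <-t.C:
				report()
			}
		}
	}()
	return done
}

// runOnce plays the role of a single test: start the loop, stop it, and
// wait for it to exit so nothing is left alive for the next iteration.
func runOnce() bool {
	stopCh := make(chan struct{})
	done := startHarvest(stopCh, time.Millisecond, func() {})
	close(stopCh)
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	// Rough analogue of running with -count=100: repeat the "test" many times.
	for i := 0; i < 100; i++ {
		if !runOnce() {
			fmt.Println("goroutine did not exit on iteration", i)
			return
		}
	}
	fmt.Println("all background goroutines exited")
}
```

In an actual Go test, the close-and-wait step would typically be registered with `t.Cleanup` so it runs even when the test fails early.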

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

I guess not.

@tekton-robot tekton-robot merged commit b13fc13 into tektoncd:main Sep 1, 2021
@mattmoor mattmoor deleted the metrics-fake-injection branch September 1, 2021 15:33