
improve logging for cip-auditor #183

Closed
listx opened this issue Feb 26, 2020 · 13 comments
Assignees
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@listx
Contributor

listx commented Feb 26, 2020

Notes from initial prod deployment of cip-auditor

On 2020-02-25 3:57PM, @listx executed ./deploy.sh k8s-artifacts-prod 5c9a73a90fdfd08adaf3d53ae863144922958911952dfa67cbf9a0d33dedde63 in order to deploy the cip-auditor Cloud Run service to the k8s-artifacts-prod project for the first time.

Shortly thereafter, @thockin pushed up a dummy image to us.gcr.io/k8s-artifacts-prod/suspicious/envelope@sha256:ec3ca3ee90e4dafde96c83232b30f17b5e8992ff35479d0661b8f4ff2f21bf74 in order to test the auditor. The goal was to see at least logs of the auditor detecting a bad change to GCR.

However, the default Cloud Run logs failed to show any meaningful messages:
(screenshot: the Cloud Run "Logs" view, showing no auditor messages)

This was very confusing.

About an hour and a half later, it was discovered that cip-auditor had in fact run and processed the Pub/Sub messages as intended. However, these log messages did not show up in the Cloud Run instance because their Stackdriver resource.type was set to project, not cloud_run_revision, which is what the Cloud Run "Logs" tab picks up by default. The auditor currently writes the interesting bits to the log named projects/k8s-artifacts-prod/logs/cip-audit-log, which is not shown in the Cloud Run dashboard.

We can improve this situation by making sure that everything we write to the projects/k8s-artifacts-prod/logs/cip-audit-log log is also written to STDOUT by the cip-auditor process.

@listx listx self-assigned this Feb 26, 2020
@listx
Contributor Author

listx commented Feb 26, 2020

Aside: the e2e tests that we run for the auditing mechanism did not pick this up because those tests grep through logs in the explicitly-named log cip-audit-log.

@listx
Contributor Author

listx commented Feb 26, 2020

So, I've discovered that the loggers can be configured to be more specific about the labels attached to them. When creating the loggers, we can set additional options as described at https://godoc.org/cloud.google.com/go/logging#Client.Logger. Specifically, we can probably set resource.type to cloud_run_revision, which should be enough to have the logs show up in the Cloud Run "Logs" dashboard, which is what we want. Then there is no need to duplicate log statements every time we write to the log named cip-audit-log.
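A sketch of what that might look like with the logging.CommonResource option from cloud.google.com/go/logging. This is not runnable as-is (it needs GCP credentials and a real Cloud Run revision), and the label values shown are placeholders, not the auditor's actual service or revision names:

```go
// Sketch only: assumes a logging client for the k8s-artifacts-prod project.
client, err := logging.NewClient(ctx, "k8s-artifacts-prod")
if err != nil {
	log.Fatal(err)
}

lg := client.Logger("cip-audit-log",
	// Attach the cloud_run_revision monitored-resource type so entries
	// are picked up by the Cloud Run "Logs" tab by default.
	logging.CommonResource(&mrpb.MonitoredResource{
		Type: "cloud_run_revision",
		Labels: map[string]string{
			"service_name":  "cip-auditor",       // placeholder
			"revision_name": "cip-auditor-00001", // placeholder
			"location":      "us-central1",       // placeholder
		},
	}))

lg.Log(logging.Entry{Severity: logging.Info, Payload: "example audit message"})
```

Here mrpb refers to the google.golang.org/genproto/googleapis/api/monitoredres package. The advantage over mirroring to STDOUT is that entries stay structured and appear in exactly one log, just with the resource type the Cloud Run dashboard expects.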

@bartsmykla

@listx do you think we can change this issue to be good-first-time-issue maybe?

@listx
Contributor Author

listx commented Mar 9, 2020

@listx do you think we can change this issue to be good-first-time-issue maybe?

I'm not sure, mainly because the verification requires some jumping around the GCP console which is something not everyone is familiar with.

@bartsmykla

got it

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 8, 2020
@listx
Contributor Author

listx commented Jun 9, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 9, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 7, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 7, 2020
@justaugustus justaugustus removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 16, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 14, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 13, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Projects
None yet
Development

No branches or pull requests

5 participants