Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to latest klog 0.4.0 #81164

Merged
merged 2 commits into from
Aug 13, 2019

Conversation

dims
Copy link
Member

@dims dims commented Aug 8, 2019

kubernetes/klog@47ffc4e - Add test case for detecting data race
kubernetes/klog@959d342 - Prevent data race in SetOutput* methods
kubernetes/klog@34123a4 - Test with golang 1.12.x
kubernetes/klog@bf4884f - Fix the log duplication issue for --log-file
kubernetes/klog@dc5546c - Backfill integration tests for selecting log destinations
kubernetes/klog@baef93d - fix broken links
kubernetes/klog@07b218b - Add go modules files
kubernetes/klog@b33ae69 - Add flag to include file's dir in logs
kubernetes/klog@7c58910 - correct documentation
kubernetes/klog@a4033db - Code Hygene - glog to klog
kubernetes/klog@941d47b - Revert "Fix the log duplication issue for --log-file."
kubernetes/klog@314f6c4 - Update godoc for the default value of logtostderr
kubernetes/klog@7eb35e2 - Fix the log duplication issue for --log-file.

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


47ffc4e Add test case for detecting data race
959d342 Prevent data race in SetOutput* methods
34123a4 Test with golang 1.12.x
bf4884f Fix the log duplication issue for --log-file
dc5546c Backfill integration tests for selecting log destinations
baef93d fix broken links
07b218b Add go modules files
b33ae69 Add flag to include file's dir in logs
7c58910 correct documentation
a4033db Code Hygene - glog to klog
941d47b Revert "Fix the log duplication issue for --log-file."
314f6c4 Update godoc for the default value of logtostderr
7eb35e2 Fix the log duplication issue for --log-file.
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Aug 8, 2019
@dims
Copy link
Member Author

dims commented Aug 8, 2019

/milestone v1.16
/priority important-soon
/area code-organization

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Aug 8, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Aug 8, 2019
@k8s-ci-robot k8s-ci-robot added area/code-organization Issues or PRs related to kubernetes code organization and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 8, 2019
@dims
Copy link
Member Author

dims commented Aug 8, 2019

/sig release

@k8s-ci-robot k8s-ci-robot added sig/release Categorizes an issue or PR as relevant to SIG Release. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes area/kubectl sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 8, 2019
@dims
Copy link
Member Author

dims commented Aug 8, 2019

@liggitt the first step i was told was to run /test pull-kubernetes-e2e-gce-large-performance which i triggered here

@dims
Copy link
Member Author

dims commented Aug 8, 2019

/test pull-kubernetes-e2e-gce-large-performance

@liggitt
Copy link
Member

liggitt commented Aug 8, 2019

kubernetes/klog@bf4884f - Fix the log duplication issue for --log-file

was any functional test added to verify kubernetes/klog@bf4884f fixed the duplication issue?

@dims
Copy link
Member Author

dims commented Aug 8, 2019

@liggitt tested this manually


if logging.logFile != "" {
// Since we are using a single log file, all of the items in l.file array
// will point to the same file, so just use one of them to write data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are using a single log file, all of the items in l.file array
will point to the same file, so just use one of them to write data.

is that accurate if SetOutputBySeverity was called?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt i should probably add godoc comments that when using SetOutput* they should avoid setting any command line params.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if CLI flags and SetOutputBySeverity are mutually exclusive, and kube doesn't use SetOutputBySeverity, that's probably ok. Definitely needs documenting

@liggitt
Copy link
Member

liggitt commented Aug 8, 2019

@liggitt tested this manually

a follow-up (in klog) that exercises this in a test would be good. this is fragile enough that I could see it silently regressing

@liggitt
Copy link
Member

liggitt commented Aug 8, 2019

/lgtm
/approve

/hold
for scale test input

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 8, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 8, 2019
@dims
Copy link
Member Author

dims commented Aug 8, 2019

@liggitt added kubernetes/klog#85 and kubernetes/klog#84 as follow ups

@dims
Copy link
Member Author

dims commented Aug 8, 2019

@liggitt @wojtek-t @mborsz as i understand it, this PR and #78466 need to be tested together in a 5k environment for us to see the regression. right?

@dims
Copy link
Member Author

dims commented Aug 8, 2019

/test pull-kubernetes-e2e-gce-large-performance

@mborsz
Copy link
Member

mborsz commented Aug 9, 2019

In the previous attempt, the klog update itself (without actually switching to --log-file flag) was a root cause of significant performance regression.

I think it would be good to test this in 5k environment before submission.

@dims
Copy link
Member Author

dims commented Aug 9, 2019

@mborsz ack agree! @wojtek-t mentioned that he can start a 5k run over the next few days

@wojtek-t
Copy link
Member

wojtek-t commented Aug 9, 2019

@dims @liggitt - I've kicked this tests and after less than an hour (~1 minute after first logrotation) kube-apiserver started crashlooping.

I will try to run it one more time, but our suspicion with @mborsz is that it may actually be exactly the same problem.

/hold

@dims
Copy link
Member Author

dims commented Aug 9, 2019

@wojtek-t did you try just this PR like @mborsz suggested above? (don't introduce #78466)

@wojtek-t
Copy link
Member

@wojtek-t did you try just this PR like @mborsz suggested above? (don't introduce #78466)

I wanted to run the previous experiment one more time to ensure it wasn't flake.
I just did that and the result was exactly the same - apiserver started crashlooping immediatley after first log-rotation. So that definitely wasn't flake.

I didn't yet try only this PR - let me try doing that now.
But even if it will work, it doesn't seem to visible help with going towards distroless, right?

@dims
Copy link
Member Author

dims commented Aug 12, 2019

@wojtek-t it will give me more information on where the problem could be. "immediately after first log-rotation" seems like a pretty big hint at this point. I just want to confirm that without the distroless just this PR works fine.

@liggitt
Copy link
Member

liggitt commented Aug 12, 2019

But even if it will work, it doesn't seem to visible help with going towards distroless, right?

That would seem to indicate the issue is not a performance issue, but an actual bug in the log rotation code (possibly when combined with the distroless environment... unexpectedly readonly filesystems, etc?)

@dims
Copy link
Member Author

dims commented Aug 12, 2019

@liggitt exactly!

@wojtek-t
Copy link
Member

That would seem to indicate the issue is not a performance issue, but an actual bug in the log rotation code (possibly when combined with the distroless environment... unexpectedly readonly filesystems, etc?)

I think that it can still be performance issue, just with distroless we're configuring stuff differently and using a different code-path.

That said, I've run scalability tests against just this PR (5k-node cluster) and the results look fine (i.e. they are comparable with head).
So from scalability POV I'm fine with this PR (as long as we don't merge the distroless one).

@dims @liggitt - feel free to hold cancel.

@wojtek-t wojtek-t self-assigned this Aug 13, 2019
@dims
Copy link
Member Author

dims commented Aug 13, 2019

@wojtek-t agree. I'll start looking at the code path for rotation in klog and file follow up PR(s). Can you please point me to logs from the broken run (with distroless) please?

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 13, 2019
@k8s-ci-robot k8s-ci-robot merged commit 461b2d8 into kubernetes:master Aug 13, 2019
wking pushed a commit to wking/kubernetes that referenced this pull request Jul 21, 2020
….4.0

Update to latest klog 0.4.0

Kubernetes-commit: 461b2d8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/cloudprovider area/code-generation area/code-organization Issues or PRs related to kubernetes code organization area/dependency Issues or PRs related to dependency changes area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/release Categorizes an issue or PR as relevant to SIG Release. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants