Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy static pod logs to systemd via --send-static-pod-logs-to-systemd #11

Closed

Conversation

sttts
Copy link
Contributor

@sttts sttts commented Jan 24, 2019

We want to get static pod logs when the installer fails for some reason. We already scrape bootkube, cri-o and kubelet. This change will "copy" the cri-o container logs to systemd unit via systemd-run --unit <pod-name> crictl logs <container-id>. After that we can scrape it from the CI job through journald.

Follow-up of openshift/release#2633.

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 24, 2019
@sttts sttts force-pushed the sttts-copy-static-pod-log-to-systemd branch from 6b110cf to 1cc910c Compare January 24, 2019 21:30
@sttts
Copy link
Contributor Author

sttts commented Jan 24, 2019

@smarterclayton @wking which do you prefer: sending static pods logs to systemd with systemd-run crictl logs ... (implemented here, ready to scrape) or just logging all static pod logs (indented) into bootkube.service ?

@sttts sttts force-pushed the sttts-copy-static-pod-log-to-systemd branch from 1cc910c to abcf9b1 Compare January 24, 2019 21:35
@smarterclayton
Copy link

smarterclayton commented Jan 24, 2019 via email

pkg/start/start.go Outdated Show resolved Hide resolved
pkg/start/start.go Outdated Show resolved Hide resolved
if len(name) == 0 {
continue
}
podName, _, _ := unstructured.NestedString(container, "labels", "io.kubernetes.pod.name")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, this can be missing (for non-pod containers). Emptyness check is what we want.

@sttts sttts force-pushed the sttts-copy-static-pod-log-to-systemd branch 5 times, most recently from 4f9859f to 5d6231f Compare January 25, 2019 10:15
@sttts sttts closed this Jan 25, 2019
@sttts sttts deleted the sttts-copy-static-pod-log-to-systemd branch January 25, 2019 16:36
@sttts sttts restored the sttts-copy-static-pod-log-to-systemd branch January 26, 2019 09:03
@sttts sttts reopened this Jan 26, 2019
@sttts
Copy link
Contributor Author

sttts commented Jan 26, 2019

/retest

@sttts sttts force-pushed the sttts-copy-static-pod-log-to-systemd branch from 5d6231f to ab5e0c1 Compare January 26, 2019 20:24
@sttts
Copy link
Contributor Author

sttts commented Jan 28, 2019

/retest

@wking
Copy link
Member

wking commented Feb 4, 2019

I haven't read the diff here, but it's green (if that's what you were waiting for), and I want the functionality it's advertising ;).

wking added a commit to wking/openshift-installer that referenced this pull request Feb 5, 2019
Through da6d45b (Merge pull request openshift#890 from
staebler/asset_loading_tests, 2019-02-04).

Background for the networking.type validation entry is in this thread
[1].

Forwarding static-pod longs to systemd is still in flight with [2].

[1]: openshift#1013 (comment)
[2]: openshift/cluster-bootstrap#11
wking added a commit to wking/openshift-installer that referenced this pull request Feb 5, 2019
Through da6d45b (Merge pull request openshift#890 from
staebler/asset_loading_tests, 2019-02-04).

Background for the networking.type validation entry is in this thread
[1].

The OpenStack HAProxy entry has wording based on [2] and Russell's
out-of-band suggestions.

Forwarding static-pod longs to systemd is still in flight with [3].

[1]: openshift#1013 (comment)
[2]: https://github.com/openshift/installer/pull/1185/files#r253714521
[3]: openshift/cluster-bootstrap#11
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 5, 2019
wking added a commit to wking/openshift-installer that referenced this pull request Feb 5, 2019
Through da6d45b (Merge pull request openshift#890 from
staebler/asset_loading_tests, 2019-02-04).

Background for the networking.type validation entry is in this thread
[1].

The OpenStack HAProxy entry has wording based on [2] and Russell's
out-of-band suggestions.

Forwarding static-pod longs to systemd is still in flight with [3].

[1]: openshift#1013 (comment)
[2]: https://github.com/openshift/installer/pull/1185/files#r253714521
[3]: openshift/cluster-bootstrap#11
wking added a commit to wking/openshift-installer that referenced this pull request Feb 5, 2019
Through da6d45b (Merge pull request openshift#890 from
staebler/asset_loading_tests, 2019-02-04).

Background for the networking.type validation entry is in this thread
[1].

The OpenStack HAProxy entry has wording based on [2] and Russell's
out-of-band suggestions.

Forwarding static-pod longs to systemd is still in flight with [3].

[1]: openshift#1013 (comment)
[2]: https://github.com/openshift/installer/pull/1185/files#r253714521
[3]: openshift/cluster-bootstrap#11
@wking
Copy link
Member

wking commented Feb 7, 2019

Also in this space: containers/podman#2265. Presumably any changes to conmon from that effort would eventually end up exposed by CRI-O too (cri-o/cri-o#377).

@sttts
Copy link
Contributor Author

sttts commented Feb 8, 2019

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 8, 2019
@sttts
Copy link
Contributor Author

sttts commented Feb 8, 2019

@wking it's not ready, we have to change our strategy and get the logs from kubelet. cluster-bootstrap runs in a container. And getting it to speaker to both cri-o on the host and to systemd of the host is awkward. @mfojtik will rewrite using the kubelet API. That's simpler.

@sttts sttts closed this Apr 2, 2019
@sttts sttts deleted the sttts-copy-static-pod-log-to-systemd branch April 2, 2019 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants