Bug 1852047: controller: Emit events #1962
Demo |
d203e2d to 4ada695
This one is passing tests and should help us a lot in tracing through upgrades in the future. Can I get an lgtm?
Looks sane, but I'll let someone with more familiarity lgtm.
This one also relates to https://bugzilla.redhat.com/show_bug.cgi?id=1852047
@cgwalters: This pull request references Bugzilla bug 1852047, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
4ada695 to 7162aab
7162aab to 7961d7d
A while ago I'd invested some time in tweaking the node controller to have useful logs around what it's doing; my first "point of contact" when looking at upgrades was its pod logs. But we lose most of those on upgrade, since the pod gets killed. Add events to the node controller too. Currently the MCD emits useful events which can be queried afterwards (in our CI runs we dump `events.json`). With this we can create a "journal/history" for upgrade/update events just by querying the event stream.
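As a rough illustration of the "journal/history" idea, here is a minimal Python sketch that turns a dumped event list into a time-ordered journal. It assumes the dump is a standard Kubernetes `List` of `Event` objects (with `items`, `lastTimestamp`, `reason`, `message`, and `involvedObject`); the exact shape of the CI `events.json` is not shown in this PR, and the sample reasons and node name below are made up for the example.

```python
import json
from datetime import datetime


def parse_ts(value):
    """Parse an RFC 3339 timestamp such as 2020-08-05T14:03:11Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_journal(event_list):
    """Return (timestamp, object, reason, message) tuples, oldest first."""
    rows = []
    for ev in event_list.get("items", []):
        # Events may carry several timestamps; take the first one present.
        ts = ev.get("lastTimestamp") or ev.get("firstTimestamp") or ev.get("eventTime")
        if not ts:
            continue
        obj = ev.get("involvedObject", {})
        rows.append((
            parse_ts(ts),
            f'{obj.get("kind", "?")}/{obj.get("name", "?")}',
            ev.get("reason", ""),
            ev.get("message", ""),
        ))
    return sorted(rows, key=lambda row: row[0])


# Inline sample standing in for a real dump. The reasons, messages, and
# node name are hypothetical and not taken from the actual controller.
SAMPLE = json.loads("""
{"items": [
  {"lastTimestamp": "2020-08-05T14:05:00Z", "reason": "ExampleReboot",
   "message": "rebooting", "involvedObject": {"kind": "Node", "name": "worker-0"}},
  {"lastTimestamp": "2020-08-05T14:01:00Z", "reason": "ExampleDrain",
   "message": "draining", "involvedObject": {"kind": "Node", "name": "worker-0"}}
]}
""")

if __name__ == "__main__":
    for ts, obj, reason, message in build_journal(SAMPLE):
        print(ts.isoformat(), obj, reason, message)
```

To try it against a real dump, replace `SAMPLE` with `json.load(open(path))` for whatever path the artifacts use.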
7961d7d to 4e133e7
I think that
It'd be really useful to me to have these changes in master, so I can start gathering more "baseline" data across all the clusters being launched for 4.6.
Is there something we should look at in the artifacts specifically to verify this?
Yep, same answer as #1977 (comment) |
We had an event when we were starting an OS update, but nothing when it was completed - one could implicitly get that by looking at the next event, but that's a bit fragile. And since then we started doing a lot more stuff with the OS, so let's add an event emitted before and after all OS changes so we can consistently get e.g. timing information about it. Relates to openshift#1962 around getting better data about timing during upgrades.
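To show why paired before/after events make timing easier, here is a small Python sketch that matches each "started" event with the next "finished" event on the same node and reports the elapsed seconds. The reason strings and node names are placeholders chosen for the example, not the names the daemon actually emits.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an RFC 3339 timestamp such as 2020-08-05T14:03:11Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def os_update_durations(events, start_reason, end_reason):
    """Pair each start event with the next end event on the same node.

    `events` is an iterable of (timestamp_string, node, reason) tuples.
    Returns a list of (node, seconds) in order of completion.
    """
    ordered = sorted((parse_ts(ts), node, reason) for ts, node, reason in events)
    pending = {}    # node -> time of the most recent unmatched start event
    durations = []
    for ts, node, reason in ordered:
        if reason == start_reason:
            pending[node] = ts
        elif reason == end_reason and node in pending:
            durations.append((node, (ts - pending.pop(node)).total_seconds()))
    return durations


# Hypothetical reasons and nodes, used only to exercise the function.
EVENTS = [
    ("2020-08-05T14:00:00Z", "worker-0", "ExampleOSUpdateStarted"),
    ("2020-08-05T14:00:30Z", "worker-1", "ExampleOSUpdateStarted"),
    ("2020-08-05T14:04:00Z", "worker-0", "ExampleOSUpdateFinished"),
    ("2020-08-05T14:06:00Z", "worker-1", "ExampleOSUpdateFinished"),
]

if __name__ == "__main__":
    for node, seconds in os_update_durations(
        EVENTS, "ExampleOSUpdateStarted", "ExampleOSUpdateFinished"
    ):
        print(node, seconds)
```

With only a "started" event, the same calculation would have to guess the end time from whatever event happened to come next, which is the fragility described above.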
/approve
/assign @runcom
We're nearing 3 weeks to get this pretty simple patch in...
I want to emphasize the value of getting this patch in soon - #1962 (comment)
lgtm.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cgwalters, kikisdeliveryservice, sinnykumari, yuqi-zhang
/retest
Please review the full test history for this PR and help us cut down flakes.
@cgwalters: The following tests failed, say
@cgwalters: All pull requests linked via external trackers have merged:
Bugzilla bug 1852047 has been moved to the MODIFIED state.