
Add tear down events #9

Merged
merged 4 commits into openshift:master from sttts-tear-down-events
Jan 28, 2019

Conversation

@sttts (Contributor) commented Jan 24, 2019

  • send out bootstrap-success event before tear down
  • optionally wait for event given to --tear-down-event.
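For illustration, here is a minimal sketch of the event-sending half of this (waiting on the named event is sketched further down in the review thread). It uses the pre-context client-go Events API visible in the diff below; the helper name and event fields are assumptions, not necessarily what the PR implements:

  import (
      "time"

      corev1 "k8s.io/api/core/v1"
      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/client-go/kubernetes"
  )

  // sendBootstrapSuccess creates the bootstrap-success event in kube-system
  // once the required bootstrap pods are running. Field choices are illustrative.
  func sendBootstrapSuccess(client kubernetes.Interface) error {
      now := metav1.Time{Time: time.Now()}
      event := &corev1.Event{
          ObjectMeta:     metav1.ObjectMeta{Namespace: "kube-system", Name: "bootstrap-success"},
          InvolvedObject: corev1.ObjectReference{Namespace: "kube-system", Name: "bootstrap-success"},
          Message:        "required pods are running; ready to tear down the bootstrap control plane",
          Count:          1,
          FirstTimestamp: now,
          LastTimestamp:  now,
      }
      _, err := client.CoreV1().Events("kube-system").Create(event)
      return err
  }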

@openshift-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Jan 24, 2019
@sttts (Contributor, Author) commented Jan 24, 2019

Untested.

@openshift-ci-robot added the approved label (indicates the PR has been approved by an approver from all required OWNERS files) Jan 24, 2019
@sttts (Contributor, Author) commented Jan 24, 2019

/assign @wking @mfojtik

@openshift-ci-robot commented

@sttts: GitHub didn't allow me to assign the following users: wking.

Note that only openshift members and repo collaborators can be assigned, and that issues/PRs can have at most 10 assignees at the same time.
For more information, please see the contributor guide.

In response to this:

/assign @wking @mfojtik


@sttts force-pushed the sttts-tear-down-events branch 3 times, most recently from dd61940 to b0f0b5f, on January 24, 2019 21:34
@@ -72,6 +74,11 @@ func (b *startCommand) Run() error {
return err
}

// notify installer that we are ready to tear down the temporary bootstrap control plane
if _, err := client.CoreV1().Events("kube-system").Create(makeBootstrapSuccessEvent("kube-system", "bootstrap-success")); err != nil {
@wking (Member) commented Jan 25, 2019

The installer uses bootstrap-complete for this. But instead of hard-coding it, can we make it an option (--pods-available-event?)? That way the installer can gracefully transition from its direct bootstrap-complete injection to a cluster-bootstrap-injected bootstrap-complete without runs doubling up on that event. If the option were not set, cluster-bootstrap would not inject an event on pod completion.

@sttts (Contributor, Author) replied

Sure, I can make that configurable. But note that the bootstrap-success event is always sent; the installer simply ignores it today because it has no code handling it yet. The only behaviour you have to enable explicitly (which changes how cluster-bootstrap acts) is --wait-for-event-before-tear-down.

@wking (Member) replied

With this change, I'm expecting this event to be able to replace the current installer-injected event. I have a WIP installer shuffle to make that work; we'll see how it goes ;).

@sttts (Contributor, Author) replied

Technically you still have to wait for bootstrap-complete, because only then can you be sure that everything has been shut down.
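A rough sketch of the configurable option suggested above, assuming cobra/pflag-style flag wiring (the flag name follows the suggestion; the option struct is hypothetical):

  // Hypothetical wiring; cluster-bootstrap's actual flag set may differ.
  // An empty value would disable the event entirely, matching the
  // "if the option was not set" behaviour described above.
  cmd.Flags().StringVar(&opts.podsAvailableEvent, "pods-available-event", "",
      "name of an event to create in kube-system once the required pods are available (empty disables it)")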

@sttts force-pushed the sttts-tear-down-events branch 2 times, most recently from 41032bc to 8564ef9, on January 25, 2019 09:49
pkg/start/start.go (outdated review thread, resolved)
bootstrapPodsRunningTimeout = 20 * time.Minute

// how long we wait for the installer to send a tear-down event after we have sent the bootstrap-success event
tearDownEventWaitTimeout = 30 * time.Minute
@wking (Member) commented

Why have these timeouts? I'd expect cluster-bootstrap to wait forever in both cases, with timeouts being set in the installer and CI tooling.

@sttts (Contributor, Author) replied

We can also wait indefinitely; I have no strong opinion. In either case we are in a bad state.

@wking (Member) replied

> In either case we are in a bad state.

Yup. But in the wait-indefinitely case, live debugging may be easier, because you stay in a similar bad state while you look around (vs. having cluster-bootstrap keel over on you in addition to whatever else went wrong).
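For reference, a minimal sketch of the wait these timeouts guard, again against the pre-context client-go API; the helper is an assumption rather than the PR's actual code, and a zero timeout gives the wait-forever behaviour argued for above:

  import (
      "fmt"
      "time"

      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/client-go/kubernetes"
  )

  // waitForTearDownEvent blocks until the named event exists in ns.
  // timeout == 0 waits indefinitely; otherwise give up after, e.g.,
  // tearDownEventWaitTimeout.
  func waitForTearDownEvent(client kubernetes.Interface, ns, name string, timeout time.Duration) error {
      var deadline time.Time
      if timeout > 0 {
          deadline = time.Now().Add(timeout)
      }
      for {
          if _, err := client.CoreV1().Events(ns).Get(name, metav1.GetOptions{}); err == nil {
              return nil
          }
          if !deadline.IsZero() && time.Now().After(deadline) {
              return fmt.Errorf("timed out waiting for event %s/%s", ns, name)
          }
          time.Sleep(5 * time.Second)
      }
  }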

@sttts force-pushed the sttts-tear-down-events branch 2 times, most recently from 8565f72 to d04e03c, on January 25, 2019 10:15
@wking (Member) commented Jan 25, 2019

Looks good to me :)

@wking (Member) commented Jan 25, 2019

/lgtm

@openshift-ci-robot commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: sttts, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot commented

@wking: changing LGTM is restricted to assignees, and only openshift/cluster-bootstrap repo collaborators may be assigned issues.

In response to this:

/lgtm


@sttts added the lgtm label (indicates that a PR is ready to be merged) Jan 25, 2019
@sttts sttts closed this Jan 25, 2019
@sttts sttts deleted the sttts-tear-down-events branch January 25, 2019 16:36
@sttts sttts restored the sttts-tear-down-events branch January 25, 2019 17:00
@sttts (Contributor, Author) commented Jan 25, 2019

@mfojtik @wking fixed a typo in the flag definition. Please retag.

@openshift-bot (Contributor) commented

/retest

Please review the full test history for this PR and help us cut down flakes.

7 similar comments

@sttts (Contributor, Author) commented Jan 26, 2019

/retest

@openshift-bot (Contributor) commented

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

@sttts added and then removed the lgtm label (indicates that a PR is ready to be merged) Jan 26, 2019
@openshift-ci-robot commented

New changes are detected. LGTM label has been removed.

@openshift-ci-robot removed the lgtm label Jan 26, 2019
@sttts (Contributor, Author) commented Jan 28, 2019

/retest

@sttts added the lgtm label Jan 28, 2019
@openshift-merge-robot merged commit e0cc400 into openshift:master Jan 28, 2019
wking added a commit to wking/openshift-installer that referenced this pull request Jan 29, 2019
…own-event

In master there's a bit of a window during the bootstrap-teardown
dance:

1. cluster-bootstrap sees the requested pods.
2. cluster-bootstrap shuts itself down.
3. openshift.sh pushes the OpenShift-specific manifests.
4. report-progress.sh pushes bootstrap-complete.
5. The installer sees bootstrap-complete and removes the bootstrap
   resources, including the bootstrap load-balancer targets.
6. subsequent Kubernetes API traffic hits the production control
   plane.

That leaves a fairly large window, from steps 3 through 5, where
Kubernetes API requests could be routed to the bootstrap machine and
dropped because it no longer has anything listening on 6443.

With this commit, we take advantage of
openshift/cluster-bootstrap@d07548e3 (Add --tear-down-event flag to
delay tear down, 2019-01-24, openshift/cluster-bootstrap#9) to drop
step 2 (waiting for an event we never send).  That leaves the
bootstrap control-plane running until we destroy that machine.  We
take advantage of openshift/cluster-bootstrap@180599bc
(pkg/start/asset: Add support for post-pod-manifests, 2019-01-29,
openshift/cluster-bootstrap#13) to replace our previous openshift.sh
(with a minor change to the manifest directory).  And we take
advantage of openshift/cluster-bootstrap@e5095848 (Create
bootstrap-success event before tear down, 2019-01-24,
openshift/cluster-bootstrap#9) to replace our previous
report-progress.sh (with a minor change to the event name).

Also set --strict, because we want to fail-fast for these resources.
The user is unlikely to scrape them out of the installer state and
push them by hand if we fail to push them from the bootstrap node.

With these changes, the new transition is:

1. cluster-bootstrap sees the requested pods.
2. cluster-bootstrap pushes the OpenShift-specific manifests.
3. cluster-bootstrap pushes bootstrap-success.
4. The installer sees bootstrap-success and removes the bootstrap
   resources, including the bootstrap load-balancer targets.
5. subsequent Kubernetes API traffic hits the production control
   plane.

There's still a small window for lost Kubernetes API traffic:

* The Terraform tear-down could remove the bootstrap machine before it
  removes the bootstrap load-balancer target, leaving the target
  pointing into empty space.
* Bootstrap teardown does not remove the load-balancer target first and
  let existing client connections drain before removing the bootstrap
  machine.

Both of these could be addressed by:

1. Remove the bootstrap load-balancer targets.
2. Wait 30 seconds (healthy_threshold * interval for our
   aws_lb_target_group [1]) for the load-balancer to notice the
   production control-plane targets are live.  This assumes the
   post-pod manifests are all pushed in zero seconds, so it's overly
   conservative, but waiting an extra 30 seconds isn't a large cost.
3. Remove the remaining bootstrap resources, including the bootstrap
   machine.

But even without that delay, this commit reduces the window compared
to what we have in master.  I'll land the delay in follow-up work.

[1]: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
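
A rough Go-style outline of that proposed ordering; the installer actually drives these steps through Terraform, so both destroy helpers below are hypothetical stand-ins:

  import (
      "context"
      "time"
  )

  func tearDownBootstrap(ctx context.Context) error {
      // 1. Remove the bootstrap load-balancer targets.
      if err := destroyBootstrapLoadBalancerTargets(ctx); err != nil {
          return err
      }
      // 2. Wait healthy_threshold * interval (30 seconds for the
      //    aws_lb_target_group health checks) so the load balancer notices
      //    the production control-plane targets.
      select {
      case <-time.After(30 * time.Second):
      case <-ctx.Done():
          return ctx.Err()
      }
      // 3. Remove the remaining bootstrap resources, including the machine.
      return destroyBootstrapMachine(ctx)
  }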
@sttts sttts deleted the sttts-tear-down-events branch April 2, 2019 12:10
wking added a commit to wking/cluster-bootstrap that referenced this pull request Apr 17, 2019
And plumb through contexts from runCmdStart so we can drop the
context.TODO() calls.

bootstrapPodsRunningTimeout was added in d07548e (Add
--tear-down-event flag to delay tear down, 2019-01-24, openshift#9), although
Stefan had no strong opinion on these timeouts at the time [1].  But as it stands, a
hung pod creates loops like [2]:

  $ tar xf log-bundle.tar.gz
  $ cd bootstrap/journals
  $ grep 'Started Bootstrap\|Error: error while checking pod status' bootkube.log
  Apr 16 17:46:23 ip-10-0-4-87 systemd[1]: Started Bootstrap a Kubernetes cluster.
  Apr 16 18:12:41 ip-10-0-4-87 bootkube.sh[1510]: Error: error while checking pod status: timed out waiting for the condition
  Apr 16 18:12:41 ip-10-0-4-87 bootkube.sh[1510]: Error: error while checking pod status: timed out waiting for the condition
  Apr 16 18:12:46 ip-10-0-4-87 systemd[1]: Started Bootstrap a Kubernetes cluster.
  Apr 16 18:33:02 ip-10-0-4-87 bootkube.sh[11418]: Error: error while checking pod status: timed out waiting for the condition
  Apr 16 18:33:02 ip-10-0-4-87 bootkube.sh[11418]: Error: error while checking pod status: timed out waiting for the condition
  Apr 16 18:33:07 ip-10-0-4-87 systemd[1]: Started Bootstrap a Kubernetes cluster.

Instead of having systemd keep kicking bootkube.sh (which in turn
keeps launching cluster-bootstrap), removing this timeout will just
leave cluster-bootstrap running while folks gather logs from the
broken cluster.  And the less spurious-restart noise there is in those
logs, the easier it will be to find what actually broke.

[1]: openshift#9 (comment)
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1700504#c14
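
A minimal sketch of the context plumbing this describes, with illustrative names; the real pkg/start signatures may differ:

  package start

  import "context"

  type startCommand struct {
      requiredPodPrefixes map[string][]string
  }

  // Run takes the caller's context instead of building context.TODO()
  // internally, so any timeout or cancellation is decided by runCmdStart
  // (or the installer/CI tooling around it), not hard-coded here.
  func (b *startCommand) Run(ctx context.Context) error {
      // With bootstrapPodsRunningTimeout gone, this blocks until the pods
      // appear or ctx is cancelled by the caller.
      return waitUntilPodsRunning(ctx, b.requiredPodPrefixes)
  }

  // waitUntilPodsRunning stands in for the existing pod-status polling loop.
  func waitUntilPodsRunning(ctx context.Context, pods map[string][]string) error {
      // ... poll pod status until everything is ready or ctx is cancelled ...
      <-ctx.Done()
      return ctx.Err()
  }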