Skip to content

Commit

Permalink
Fix bug with hung GameServer resource on Kubernetes 1.10
Browse files Browse the repository at this point in the history
It looks like the `foregroundDeletion` finalizer that gets added to the
GameServer when a deletion with foreground propogation is requested -
when it gets removed, by the Kubernetes system, the ResourceVersion/Generation
doesn't get incremented.

This means when we do an `Update` it will go through (usually it fails if the
ResourceVersion/Generation has changed since the last update), rather than
failing and going back to re-sync.

This can cause a race condition where the `Update` can basically put back
the `foregroundDeletion` finalizer, if it gets removed between retrieving the
`GameServer` object and doing the `Update()` operation.

We actually don't need the `foregroundDeletion` propogation on `Delete()`, as
we only want to keep the `GameServer` around until the backing `Pod` is gone,
so we can actually just turn this off instead in the code.

As a note - also explored using a `Patch()` operation to try and solve this
problem. AFAIK, a patch (and specifically a JSONPatch - other patch types
didn't seem better), can only do index based removal from a list of items.

This could just present other types of race conditions, as finalizers could
be added/removed by other systems at the same time as well, changing the index
to the finalizer we want at the same time.

Not sure if this is a bug in the Kubernetes system, or working as intended,
but this solves the issue.
  • Loading branch information
markmandel committed Jul 9, 2018
1 parent d93627c commit d3e9d7e
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions pkg/gameservers/controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -510,8 +510,9 @@ func (c *Controller) syncGameServerShutdownState(gs *stablev1alpha1.GameServer)
}

c.logger.WithField("gs", gs).Info("Syncing Shutdown State")
// Do it in the foreground, so the GameServer gets killed last
p := metav1.DeletePropagationForeground
// be explicit about where to delete. We only need to wait for the Pod to be removed, which we handle with our
// own finalizer.
p := metav1.DeletePropagationBackground
err := c.gameServerGetter.GameServers(gs.ObjectMeta.Namespace).Delete(gs.ObjectMeta.Name, &metav1.DeleteOptions{PropagationPolicy: &p})
if err != nil {
return errors.Wrapf(err, "error deleting Game Server %s", gs.ObjectMeta.Name)
Expand Down

0 comments on commit d3e9d7e

Please sign in to comment.