
StatefulSet delete behavior changed in v1.11 #68627

Closed
rezroo opened this issue Sep 13, 2018 · 23 comments
Assignees
kow3ns, krmayankk
Labels
area/stateful-apps, kind/api-change, kind/bug, lifecycle/rotten, sig/apps

Comments

@rezroo

rezroo commented Sep 13, 2018

/kind bug

What happened:
In k8s v1.11.1, all StatefulSet pods are deleted at the same time. This is what I observe:

stack@master:~$ kubectl get pod -w
NAME           READY     STATUS    RESTARTS   AGE
echoserver-0   0/1       Pending   0          0s
echoserver-0   0/1       Pending   0         0s
echoserver-0   0/1       ContainerCreating   0         0s
echoserver-0   0/1       ContainerCreating   0         0s
echoserver-0   1/1       Running   0         1s
echoserver-1   0/1       Pending   0         0s
echoserver-1   0/1       Pending   0         0s
echoserver-1   0/1       ContainerCreating   0         0s
echoserver-1   0/1       ContainerCreating   0         0s
echoserver-1   1/1       Running   0         1s
echoserver-2   0/1       Pending   0         0s
echoserver-2   0/1       Pending   0         0s
echoserver-2   0/1       ContainerCreating   0         0s
echoserver-2   0/1       ContainerCreating   0         0s
echoserver-2   1/1       Running   0         1s
echoserver-0   1/1       Terminating   0         22s
echoserver-2   1/1       Terminating   0         20s
echoserver-1   1/1       Terminating   0         21s
echoserver-1   0/1       Terminating   0         1m
echoserver-0   0/1       Terminating   0         1m
echoserver-0   0/1       Terminating   0         1m
echoserver-0   0/1       Terminating   0         1m
echoserver-2   0/1       Terminating   0         1m
echoserver-2   0/1       Terminating   0         1m
echoserver-2   0/1       Terminating   0         1m
echoserver-2   0/1       Terminating   0         1m
echoserver-1   0/1       Terminating   0         1m
echoserver-1   0/1       Terminating   0         1m

What you expected to happen:
In k8s v1.9.3, StatefulSet deletion adheres to the documentation and follows ordered, graceful deletion and termination. In v1.9.3, when I delete the StatefulSet, this is what I see:

stack@master:~$ kubectl get pod -w
NAME           READY     STATUS    RESTARTS   AGE
echoserver-0   1/1       Running   0          22s
echoserver-1   1/1       Running   0          15s
echoserver-2   1/1       Running   0          9s
echoserver-2   1/1       Terminating   0         29s
echoserver-2   0/1       Terminating   0         30s
echoserver-2   0/1       Terminating   0         38s
echoserver-2   0/1       Terminating   0         38s
echoserver-1   1/1       Terminating   0         44s
echoserver-1   0/1       Terminating   0         45s
echoserver-1   0/1       Terminating   0         55s
echoserver-1   0/1       Terminating   0         55s
echoserver-0   1/1       Terminating   0         1m
echoserver-0   0/1       Terminating   0         1m
echoserver-0   0/1       Terminating   0         1m
echoserver-0   0/1       Terminating   0         1m

How to reproduce it (as minimally and precisely as possible):
Create, then delete, the following (a command sketch follows the manifest):

apiVersion: v1
kind: Service
metadata:
  name: echoserver
  labels:
    app: echoserver
spec:
  ports:
  - port: 80
    targetPort: 8080
  clusterIP: None
  selector:
    app: echoserver
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: echoserver
spec:
  serviceName: echoserver
  replicas: 3
  selector:
    matchLabels:
      app: echoserver
  template:
    metadata:
      labels:
        app: echoserver
    spec:
      terminationGracePeriodSeconds: 120
      containers:
      - image: alpine
        name: alpine
        command:
          - sleep
          - '70'
        imagePullPolicy: IfNotPresent
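
A minimal command sequence to reproduce, assuming the manifests above are saved to a file (the name echoserver.yaml is just a placeholder):

$ kubectl create -f echoserver.yaml
$ kubectl get pod -w          # watch until echoserver-0 through echoserver-2 are Running
$ kubectl delete -f echoserver.yaml

On v1.9.3 the pods then terminate one at a time in reverse ordinal order; on v1.11.1 all three go to Terminating at once.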

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:43:26Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
    NAME="Ubuntu"
    VERSION="16.04.2 LTS (Xenial Xerus)"
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 13, 2018
@krmayankk

/sig apps

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 13, 2018
@krmayankk

It seems the default value of pod management policy changed. You can choose Parallel and see if that fixes it: https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#pod-management-policy
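
For reference, a sketch of where that field sits in the spec, reusing the echoserver StatefulSet from the report (the field can only be set at creation time, so it has to go in the manifest rather than be patched in later):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: echoserver
spec:
  podManagementPolicy: Parallel   # default is OrderedReady
  serviceName: echoserver
  replicas: 3
  selector:
    matchLabels:
      app: echoserver
  template:
    metadata:
      labels:
        app: echoserver
    spec:
      terminationGracePeriodSeconds: 120
      containers:
      - name: alpine
        image: alpine
        command: ["sleep", "70"]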

@rezroo
Author

rezroo commented Sep 13, 2018

$ kubectl get statefulsets.apps echoserver -o yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: 2018-09-13T22:21:22Z
  generation: 1
  name: echoserver
  namespace: default
  resourceVersion: "1935"
  selfLink: /apis/apps/v1/namespaces/default/statefulsets/echoserver
  uid: 58639393-b7a3-11e8-9df7-02f13ec36f84
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: echoserver
  serviceName: echoserver
  template:

Creation:

echoserver-0   0/1       Pending   0         0s        <none>    <none>
echoserver-0   0/1       Pending   0         0s        <none>    node2
echoserver-0   0/1       ContainerCreating   0         0s        <none>    node2
echoserver-0   0/1       ContainerCreating   0         1s        <none>    node2
echoserver-0   1/1       Running   0         1s        192.168.2.5   node2
echoserver-1   0/1       Pending   0         0s        <none>    <none>
echoserver-1   0/1       Pending   0         0s        <none>    node1
echoserver-1   0/1       ContainerCreating   0         0s        <none>    node1
echoserver-1   0/1       ContainerCreating   0         1s        <none>    node1
echoserver-1   1/1       Running   0         2s        192.168.1.5   node1
echoserver-2   0/1       Pending   0         0s        <none>    <none>
echoserver-2   0/1       Pending   0         0s        <none>    node2
echoserver-2   0/1       ContainerCreating   0         0s        <none>    node2
echoserver-2   0/1       ContainerCreating   0         0s        <none>    node2
echoserver-2   1/1       Running   0         1s        192.168.2.6   node2

Deletion:

echoserver-2   1/1       Terminating   0         2m        192.168.2.6   node2
echoserver-1   1/1       Terminating   0         2m        192.168.1.5   node1
echoserver-0   1/1       Terminating   0         2m        192.168.2.5   node2
echoserver-2   0/1       Terminating   0         2m        192.168.2.6   node2
echoserver-0   0/1       Terminating   0         2m        192.168.2.5   node2
echoserver-1   0/1       Terminating   0         2m        192.168.1.5   node1
echoserver-0   0/1       Terminating   0         2m        192.168.2.5   node2
echoserver-0   0/1       Terminating   0         2m        192.168.2.5   node2
echoserver-2   0/1       Terminating   0         2m        192.168.2.6   node2
echoserver-2   0/1       Terminating   0         2m        192.168.2.6   node2
echoserver-1   0/1       Terminating   0         2m        192.168.1.5   node1
echoserver-1   0/1       Terminating   0         2m        192.168.1.5   node1

@liggitt
Member

liggitt commented Sep 14, 2018

It seems the default value of pod management policy changed

I don't see a default change between v1beta1 and v1

@jlegrone
Contributor

/area stateful-apps

@Pingan2017
Member

Pingan2017 commented Sep 17, 2018

Maybe the reason is that the default delete option for kubectl delete changed:
before 1.11, it was Foreground; after, it is Background.
#65908
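
If the propagation policy default is indeed what changed, a client can pin it explicitly instead of relying on the kubectl default. A sketch using the raw API through kubectl proxy (namespace and name assume the echoserver example above):

$ kubectl proxy --port=8001 &
$ curl -X DELETE http://127.0.0.1:8001/apis/apps/v1/namespaces/default/statefulsets/echoserver \
    -H 'Content-Type: application/json' \
    -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}'

Note that foreground propagation only makes the owner wait until its dependents are gone; by itself it does not impose the reverse-ordinal, one-at-a-time termination discussed in the later comments.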

@Pingan2017
Member

/sig cli

@k8s-ci-robot k8s-ci-robot added the sig/cli Categorizes an issue or PR as relevant to SIG CLI. label Sep 17, 2018
@liggitt
Member

liggitt commented Sep 17, 2018

before 1.11, it was Foreground; after, it is Background.

That change was made between 1.11.0 and 1.11.1

@mortent
Member

mortent commented Sep 23, 2018

I think this changed with the removal of reapers in #63979. There used to be a reaper for StatefulSet that took care of orderly and graceful termination from the client, but now termination of pods is done by the garbage collector.
This behavior changed in 1.11.0, but it depends upon the client version and not the cluster version. Running a 1.10 client against a 1.11 cluster has the expected behavior.

@liggitt
Member

liggitt commented Sep 23, 2018

This behavior changed in 1.11.0

If behavior is represented in the API, it should not depend on a specific client-side implementation for that behavior.

There used to be a reaper for StatefulSet that took care of orderly and graceful termination from the client, but now termination of pods is done by the garbage collector.

It is not clear from the documentation of that field whether it applies to deletion of the StatefulSet or only to scale-down. If it is intended to apply to StatefulSet deletion as well, it must be implemented via finalizers/controller so that the field is honored when any client does a single cascading delete (as kubectl does in 1.11, but as any other client could always have done).

@liggitt
Member

liggitt commented Sep 23, 2018

/remove-sig cli

@k8s-ci-robot k8s-ci-robot removed the sig/cli Categorizes an issue or PR as relevant to SIG CLI. label Sep 23, 2018
@liggitt
Member

liggitt commented Sep 23, 2018

@smarterclayton @kubernetes/sig-apps-api-reviews for question about statefulset deletion behavior and the podManagementPolicy API field

@k8s-ci-robot k8s-ci-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Sep 23, 2018
@smarterclayton
Contributor

StatefulSet deletion never had any ordering guarantees, just like Deployments or DaemonSets. Consumers must scale to zero and wait for ack. It's unfortunate that the removal of reapers exposed this to end users.

I think it's reasonable to consider changes to StatefulSets to offer a simpler path for controlled shutdown within the controller.

@krmayankk

Ah, I missed that this was happening for deletion and not rolling update. What is the guarantee for Deployments?

@mortent
Member

mortent commented Sep 25, 2018

I have created a PR to update the documentation to reflect the current behavior and guarantees provided by StatefulSets: kubernetes/website#10380

I discussed this with @enisoc and @kow3ns, and there probably is a way to do this using a custom finalizer to prevent the garbage collector from deleting the pods until the StatefulSet controller can scale down to 0. It gets a little more complicated when considering upgrades and rollbacks, but we think it can be solved.

But it is not clear that the benefits of adding this outweigh the cost. It is pretty easy to work around this by simply scaling down the StatefulSet before deleting it. We think that with updated documentation, we can keep the current behavior and revisit it if there is demand for a better solution.
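
Roughly, the workaround described above (names assume the echoserver example; the polling loop is just one way to wait for the controller to finish the ordered scale-down):

$ kubectl scale statefulset echoserver --replicas=0
$ until [ "$(kubectl get pods -l app=echoserver --no-headers 2>/dev/null | wc -l)" -eq 0 ]; do sleep 2; done
$ kubectl delete statefulset echoserver

Because the scale-down is driven by the StatefulSet controller, the pods terminate one at a time in reverse ordinal order, matching the pre-1.11 kubectl delete behavior.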

@kow3ns
Member

kow3ns commented Sep 27, 2018

/assign kow3ns

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2018
@krmayankk

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 27, 2018
@krmayankk

/assign @krmayankk

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 27, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 26, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
