Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extended load - StatefulSets and PVs. #716

Closed
wants to merge 1 commit into from

Conversation

mm4tt
Copy link
Contributor

@mm4tt mm4tt commented Jul 31, 2019

Ref. #704

As explained in one TODO, PVs are created by StatefulSets but are not deleted until namespaces are deleted (so after test, after all measurements are stopped). This is something we should change.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 31, 2019
@mm4tt mm4tt closed this Jul 31, 2019
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 31, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mm4tt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 31, 2019
@mm4tt mm4tt reopened this Jul 31, 2019
@k8s-ci-robot k8s-ci-robot requested review from mborsz and oxddr July 31, 2019 09:48
@mm4tt mm4tt force-pushed the extended_load_ss_pv branch from 231e460 to 4ef02b5 Compare July 31, 2019 09:48
@mm4tt
Copy link
Contributor Author

mm4tt commented Jul 31, 2019

/test pull-perf-tests-clusterloader2

@mm4tt
Copy link
Contributor Author

mm4tt commented Jul 31, 2019

/test pull-perf-tests-clusterloader2-kubemark

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2019

# StatefulSets
{{$STATEFUL_SET_SIZE := 5}}
{{$STATEFUL_SETS_PER_NAMESPACE := 2}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently there is a feature for parallel-pod-management in statefulset:
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#parallel-pod-management

So I think we should use it and have one of each size per namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

# measure api call latency for that. There are two options here:
# 1. Create a new measurement that will delete PVs created by StatefulSets
# 2. Create PVs manually, attach them here (but not via volumeClaimTemplates) and delete them
# manually later
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think that there is a third option which is faster and simpler to do (and I would vote for proceeding this path at least for now).
PVCs connected to StatefulSet have naming schema: "-statefulset-name>-".

So we can actually create a template file for PVC and we will simply add a step to delete PVCs that would look like:

    namespaceRange:
      min: 1
      max: {{$namespaces}}
    replicasPerNamespace: 0
    tuningSet: RandomizedSaturationTimeLimited
    objectBundle:
      - basename: pvname-ssname
        objectTemplatePath: pvc.yaml

If at some point we decide for more pvc-per-pod or more statefulset-per-namespace, we can opaque it in range (ugly but works - I don't we need it immediately).

The only remaining problem we have is that we enumrate from 1 to N, and StatefulSet from 0 to N-1.
But I think we can change that universally for CL, to actually use 0..N-1 scheme (I think that's reasonable on itself).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but from what I managed to get from code it looks like we're enumerating from 0 in CL2 already. Please correct me if I'm wrong

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed f2f, unfortunately this wouldn't work, as CL2 is not able to delete objects that it didn't create. Long story short it keeps it own state of all objects that were created/modified. When we try to delete the PVCs, CL2 doesn't know about them and assumes that none of them exist and doesn't do anything when deleting.

Leaving the TODO as it was, will tackle it in a separate PR.

action: start
apiVersion: apps/v1
kind: StatefulSet
labelSelector: group = statefulset
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is pretty much only selecting controlling objects (so statefulsets).

So for consistency with deployments (and probably other in the near future), I vote for making that:

group = load

This would mean that they will also be measured as part of pod-startup-time and all stuff that was assuming that all pods actually have group = load label.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, done.

@mm4tt mm4tt force-pushed the extended_load_ss_pv branch from 4ef02b5 to b56c389 Compare September 3, 2019 14:27
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 3, 2019
Copy link
Contributor Author

@mm4tt mm4tt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold

Let's wait for kubernetes/test-infra#14174, #772 and #767 to get. One is required to properly roll-out this experiment, two other are baselines that will make reviewing this easier.


# StatefulSets
{{$STATEFUL_SET_SIZE := 5}}
{{$STATEFUL_SETS_PER_NAMESPACE := 2}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

action: start
apiVersion: apps/v1
kind: StatefulSet
labelSelector: group = statefulset
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, done.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 3, 2019
@@ -8,23 +8,36 @@
{{$NODE_MODE := DefaultParam .NODE_MODE "allnodes"}}
{{$NODES_PER_NAMESPACE := DefaultParam .NODES_PER_NAMESPACE 100}}
{{$PODS_PER_NODE := DefaultParam .PODS_PER_NODE 30}}
{{$LOAD_TEST_THROUGHPUT := DefaultParam .LOAD_TEST_THROUGHPUT 10}}
{{$LOAD_TEST_THROUGHPUT := DefaultParam .LOAD_TEST_THROUGHPUT 100}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert

{{$BIG_GROUP_SIZE := 250}}
{{$MEDIUM_GROUP_SIZE := 30}}
{{$SMALL_GROUP_SIZE := 5}}
{{$STATEFUL_SET_SIZE := 10}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently one of my comments disappeared in the meantime.
What I was suggesting that with the ability for parallel-pod-management, we may want to create 1 of each: small, medium and large StatefulSet per namespace instead of a corresponding deployment.

This would also simplify the logic below (i.e. you would leave everything as was before and do:

if ENABLE_STATEFUL_SETS {
if small-deployments > 0 {
  small-deployments--
  small-statefulsets++
}
if medium-deployments > 0 {
 ...
...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also fine with (as a first step) getting rid of stateful-sets-per-namespace and explicitly assuming 1 of each.


# TODO: Remove
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@mm4tt mm4tt force-pushed the extended_load_ss_pv branch from b56c389 to 0de2ffe Compare September 4, 2019 09:58
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 4, 2019
@mm4tt mm4tt force-pushed the extended_load_ss_pv branch from 0de2ffe to 159b650 Compare September 4, 2019 10:01
@k8s-ci-robot
Copy link
Contributor

@mm4tt: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-perf-tests-clusterloader2 159b650 link /test pull-perf-tests-clusterloader2

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

cpu: 10m
memory: "10M"
{{if $EnablePVs}}
# TODO(#704): We should have better control over deleting PVs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned offline - let's maybe first do only StatefulSets and we will extend with PVs in a separate step.

@mm4tt
Copy link
Contributor Author

mm4tt commented Sep 4, 2019

Closing this one, will add StatefulSets separately in #776 and then tackle PVs in a separate PR.

@mm4tt mm4tt closed this Sep 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants