Scalability load test extended to exercise Deployments, DaemonSets, StatefulSets, Jobs, PersistentVolumes, Secrets, ConfigMaps, NetworkPolicies #704
Comments
/assign
I ran a 5K-node test yesterday, using the extended load scenario with Secrets, ConfigMaps, StatefulSets, and PVs enabled.
I checked the Prometheus graphs for that run, and it looks like the Prometheus-based api-call-latency measurement was broken by single spikes (happening around log rotation) in all cases. This is a known problem with Prometheus-based api-call latency (it's actually the reason it's currently disabled). @oxddr and @krzysied are working on this, and hopefully we'll have some solution soon. To summarize, the extended load looks very promising. Once we have a solution to the spike problem in Prometheus api-call latency, we should be good (or really close) to enable it in CI/CD.
After discussing with the team, we agreed that we should be good to enable Secrets and ConfigMaps in CI/CD tests. On the other hand, it might be tricky, as we currently have a separate experimental config for the extended load.
For the record, the fact that the Prometheus-based API call latency SLO was violated can actually be valid. But I understand that SLO violations caused by logrotate are orthogonal to the changes you made. The Prometheus-based measurement is closer to the SLO definition and thus stricter and more prone to violations caused by spikes.
This is most likely a no-op until we turn on some Network Policy Provider that will start enforcing these network policies. It should be pretty straightforward to turn on Calico both in GKE and in GCE. This should be done separately, to isolate any potential performance impact of turning it on. Ref. kubernetes#704
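To illustrate why this is a no-op without an enforcing provider: a NetworkPolicy object like the minimal sketch below is accepted and stored by the API server regardless, but it only affects traffic once a provider such as Calico implements it in the data plane. This manifest is illustrative only; the names and selectors are hypothetical, not taken from the actual load test templates.

```yaml
# Illustrative NetworkPolicy of the kind the load test could create.
# All names and label selectors here are hypothetical.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-load-policy
  namespace: test-namespace
spec:
  # Select the pods this policy applies to.
  podSelector:
    matchLabels:
      group: load
  policyTypes:
  - Ingress
  ingress:
  # Allow ingress only from pods with the same label.
  - from:
    - podSelector:
        matchLabels:
          group: load
```

Without an enforcing provider, creating and deleting such objects exercises only the API server and etcd, which is still useful for scalability testing of the control plane.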
This is a no-op, following the experiment rollout procedure described at https://github.com/kubernetes/perf-tests/blob/master/clusterloader2/docs/experiments.md Ref. kubernetes/perf-tests#704
This also merges experimental load with the real load test. The "knob" has been enabled in presubmits in kubernetes/test-infra#14166. Ref. kubernetes#704
Will keep an eye on the next runs and roll back / disable for some jobs if needed. Ref. kubernetes/perf-tests#704
Similarly to the other new resources, will keep an eye on the next runs and roll back / disable for some jobs if needed. kubernetes/perf-tests#704
The only tricky part here is deleting PVs that are created via StatefulSets. These PVs are not automatically deleted when the StatefulSets are deleted. Because of that, I extended the ClusterLoader Phase API to allow deleting objects that weren't created directly via CL2. The way it works is that once we detect a new object, if a certain option is set, we issue a List request to find the number of replicas. Ref. kubernetes#704
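The mechanism above could be sketched as a CL2 delete phase, where setting replicasPerNamespace to 0 deletes matching objects and an extra option tells CL2 to List objects it did not create itself. This is only a sketch: the option name `listUnknownObjectOptions` and its fields are hypothetical stand-ins, not the actual ClusterLoader2 API.

```yaml
# Illustrative CL2 phase for cleaning up PVs left behind by StatefulSets.
# "listUnknownObjectOptions" and its fields are hypothetical names used
# only to sketch the idea described in the comment above.
- phases:
  - namespaceRange:
      min: 1
      max: 1
    replicasPerNamespace: 0        # 0 replicas = delete matching objects
    tuningSet: default
    objectBundle:
    - basename: pv
      objectTemplatePath: pv.yaml
      listUnknownObjectOptions:    # issue a List to discover objects not created by CL2
        labelSelector:
          matchLabels:
            name: statefulset-pv
```

The List step is what lets CL2 discover how many such objects exist (the "number of replicas" mentioned above) before issuing the deletes.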
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
We hope to get back to this in Q1 2020.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
/lifecycle frozen
- Implemented
- Enabled in CI/CD