
Add support for scale subresource to Connect, S2I, MM1, MM2, Bridge and Connectors #3165

Merged
scholzj merged 16 commits into strimzi:master on Jun 11, 2020

Conversation

scholzj
Member

@scholzj scholzj commented Jun 6, 2020

Type of change

  • Enhancement / new feature

Description

This PR adds support for the scale subresource. Right now it is added to the following resources:

  • KafkaConnect, KafkaConnectS2I and KafkaMirrorMaker2
  • KafkaMirrorMaker
  • KafkaBridge
  • KafkaConnector

The scale subresource is useful because it makes it easier to scale the resources and, for example, to use different autoscaling tools on them.

To allow support for the scale subresource, some new status fields have been added. One is a field which mirrors the corresponding number of replicas. The other is a label selector, which is important for use with Horizontal Pod Autoscalers.

For all Deployment-based resources, this is done with a new field .status.replicas, which mirrors .spec.replicas, and .status.podSelector, which can be used to select the pods. KafkaConnector is an example of scaling a resource not represented by pods. In this case, the label selector is not present and .spec.tasksMax and .status.tasksMax represent the replicas field.
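For illustration, wiring the scale subresource into a CRD looks roughly like this (a sketch based on the Kubernetes apiextensions API; the resource name is just an example and the exact paths in the generated Strimzi CRDs may differ):

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: kafkaconnects.kafka.strimzi.io
spec:
  # ...group, names, versions, validation omitted...
  subresources:
    status: {}
    scale:
      # paths in the custom resource that the Scale object reads/writes
      specReplicasPath: .spec.replicas
      statusReplicasPath: .status.replicas
      # label selector exposed for Horizontal Pod Autoscalers
      labelSelectorPath: .status.podSelector
```

With something like this in place, `kubectl scale kafkaconnect my-connect --replicas=3` goes through the Scale object instead of requiring an edit of the whole resource.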

The tests include a test for the CRD generator with a CRD including status and scale subresources, tests in the api module verifying that the scale subresource works, and Cluster Operator tests covering the new status fields. To make the api tests pass I had to update kubectl to 1.16 (1.15 has a bug, kubernetes/kubernetes#81342, due to which it doesn't work).

This PR also fixes the flaky test KafkaCrdIT.testKafkaWithTemplate which was causing problems while testing this.

Checklist

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Update CHANGELOG.md

@scholzj scholzj added this to the 0.19.0 milestone Jun 6, 2020
@scholzj scholzj requested a review from tombentley June 6, 2020 23:09
@scholzj
Member Author

scholzj commented Jun 6, 2020

@tombentley Before adding this to more resources and writing tests, I wondered what you think about the code and whether you have any early comments.

Member

@tombentley tombentley left a comment


Out of interest, what metrics do you think an autoscaler for connectors would use to decide how to scale, since Kube has no way to attribute CPU etc. to a particular connector within a multi-connector cluster?

@scholzj scholzj force-pushed the support-scale-subresource-in-our-crds branch from 9968ff7 to 6a5727d on June 8, 2020 21:47
Member

@ppatierno ppatierno left a comment


How much do you think the scale subresource is useful for the bridge? We know that scaling the bridge is actually not so simple, and we always suggest using different deployments.

Contributor

@samuel-hawker samuel-hawker left a comment


Aside from the other maintainers' concerns, this looks good to me!

@scholzj
Member Author

scholzj commented Jun 9, 2020

How much do you think the scale subresource is useful for the bridge? We know that scaling the bridge is actually not so simple, and we always suggest using different deployments.

It could actually be quite useful when using the bridge for producing messages, for example, no? That might even be something where you can use the usual autoscaling mechanisms such as CPU utilization etc.

This PR doesn't really do anything more than make it possible for users to use the scale subresource. That is something the user can already do by editing the resource anyway; this is just an easier way. As with any autoscaling, the user needs to plug it in manually and think about the autoscaling rules. So I think only the future will show how useful or useless this is, depending on whether users find some good way to use it, but I think there is little harm.
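As a sketch of what plugging in an autoscaler could look like once the scale subresource is in place (the resource name is hypothetical, and the HPA would rely on the label selector exposed in the status to find the pods for CPU metrics):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-bridge-hpa
spec:
  # point the HPA at the custom resource instead of the Deployment
  scaleTargetRef:
    apiVersion: kafka.strimzi.io/v1beta1
    kind: KafkaBridge
    name: my-bridge
  minReplicas: 1
  maxReplicas: 5
  # scale out when average CPU utilization of the selected pods exceeds 80%
  targetCPUUtilizationPercentage: 80
```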

@ppatierno
Member

Yes, good point. The bridge scaling problem is mostly related to consumers; using scaling for producers could make sense.
Other than CPU, if we provide HTTP metrics out of the bridge, they could be used for scaling as well (of course considering the PUT ones for sending messages on specific endpoints). I was wondering how interesting an integration with KEDA could be for this.
I will take a look, it's interesting stuff for me :-)

@scholzj
Member Author

scholzj commented Jun 9, 2020

Other than CPU, if we provide HTTP metrics out of the bridge, they could be used for scaling as well (of course considering the PUT ones for sending messages on specific endpoints). I was wondering how interesting an integration with KEDA could be for this.

Yeah ... we should have some metrics in the bridge as well. But that is missing right now. As for KEDA, I need to check with them, but from the docs my understanding is that today KEDA does not support scaling custom resources and supports only scaling of Deployments. So ideally, once we have this ready, we should convince them to add support for scaling any resource and not just Deployments.

@ppatierno
Member

@scholzj I had a chat with KEDA guys. They have the feature planned for KEDA v2 (in one month or so) kedacore/keda#703.
I will plan to work on adding metrics on the bridge for this.

@scholzj
Member Author

scholzj commented Jun 9, 2020

@ppatierno Right, this is exactly what would make these changes a bit more useful.

scholzj and others added 6 commits June 10, 2020 14:58
…d KafkaConnector

Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>

Co-authored-by: Tom Bentley <tombentley@users.noreply.github.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
@scholzj scholzj force-pushed the support-scale-subresource-in-our-crds branch from 745ddd6 to 5587b10 on June 10, 2020 14:04
…ces, try to fix the CRD tests on Minikube

Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
…es bug #81342

Signed-off-by: Jakub Scholz <www@scholzj.com>
Signed-off-by: Jakub Scholz <www@scholzj.com>
@scholzj scholzj changed the title from "Add support for scale subresource" to "Add support for scale subresource to Connect, S2I, MM1, MM2, Bridge and Connectors" Jun 10, 2020
@scholzj scholzj marked this pull request as ready for review June 10, 2020 19:45
@scholzj
Member Author

scholzj commented Jun 10, 2020

@ppatierno @samuel-hawker @tombentley I added tests etc. and this should now be ready for final review.

@scholzj
Member Author

scholzj commented Jun 10, 2020

/azp run system-tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Jakub Scholz <www@scholzj.com>
@scholzj
Member Author

scholzj commented Jun 10, 2020

@strimzi-ci run tests

@strimzi-ci

❌ Test Summary ❌

TEST_PROFILE: acceptance
EXCLUDED_GROUPS: networkpolicies
TEST_CASE: *ST
TOTAL: 25
PASS: 9
FAIL: 16
SKIP: 0
BUILD_NUMBER: 1073
BUILD_ENV: oc cluster up

❗ Test Failures ❗

  • io.strimzi.systemtest.bridge.HttpBridgeTlsST in io.strimzi.systemtest.bridge.HttpBridgeTlsST
  • io.strimzi.systemtest.oauth.OauthTlsST in io.strimzi.systemtest.oauth.OauthTlsST
  • testKafkaAndZookeeperScaleUpScaleDown in io.strimzi.systemtest.rollingupdate.RollingUpdateST
  • testAutoRenewAllCaCertsTriggeredByAnno in io.strimzi.systemtest.security.SecurityST
  • testCustomSoloCertificatesForRoute in io.strimzi.systemtest.kafka.ListenersST
  • testNodePortTls in io.strimzi.systemtest.KafkaST
  • testSendMessagesPlainScramSha in io.strimzi.systemtest.KafkaST
  • testLoadBalancerTls in io.strimzi.systemtest.KafkaST
  • io.strimzi.systemtest.RecoveryST in io.strimzi.systemtest.RecoveryST
  • testMirrorMaker2TlsAndTlsClientAuth in io.strimzi.systemtest.MirrorMaker2ST
  • testKafkaConnectorWithConnectS2IAndConnectWithSameName in io.strimzi.systemtest.ConnectS2IST
  • testProducerConsumerStreamsService in io.strimzi.systemtest.tracing.TracingST
  • testMirrorMakerTlsAuthenticated in io.strimzi.systemtest.MirrorMakerST
  • testUpdateUser in io.strimzi.systemtest.UserST
  • testMultiNodeKafkaConnectWithConnectorCreation in io.strimzi.systemtest.ConnectST
  • testKafkaConnectAndConnectorFileSinkPlugin in io.strimzi.systemtest.ConnectST

Re-run command:
@strimzi-ci run tests false profile=acceptance testcase=io.strimzi.systemtest.bridge.HttpBridgeTlsST,io.strimzi.systemtest.oauth.OauthTlsST,io.strimzi.systemtest.rollingupdate.RollingUpdateST#testKafkaAndZookeeperScaleUpScaleDown,io.strimzi.systemtest.security.SecurityST#testAutoRenewAllCaCertsTriggeredByAnno,io.strimzi.systemtest.kafka.ListenersST#testCustomSoloCertificatesForRoute,io.strimzi.systemtest.KafkaST#testNodePortTls,io.strimzi.systemtest.KafkaST#testSendMessagesPlainScramSha,io.strimzi.systemtest.KafkaST#testLoadBalancerTls,io.strimzi.systemtest.RecoveryST,io.strimzi.systemtest.MirrorMaker2ST#testMirrorMaker2TlsAndTlsClientAuth,io.strimzi.systemtest.ConnectS2IST#testKafkaConnectorWithConnectS2IAndConnectWithSameName,io.strimzi.systemtest.tracing.TracingST#testProducerConsumerStreamsService,io.strimzi.systemtest.MirrorMakerST#testMirrorMakerTlsAuthenticated,io.strimzi.systemtest.UserST#testUpdateUser,io.strimzi.systemtest.ConnectST#testMultiNodeKafkaConnectWithConnectorCreation,io.strimzi.systemtest.ConnectST#testKafkaConnectAndConnectorFileSinkPlugin

@scholzj
Member Author

scholzj commented Jun 11, 2020

/azp run system-tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@scholzj scholzj merged commit 3a06fbd into strimzi:master Jun 11, 2020
@scholzj scholzj deleted the support-scale-subresource-in-our-crds branch June 11, 2020 20:42
@exherb

exherb commented Aug 2, 2020

{
  "kind": "Scale",
  "apiVersion": "autoscaling/v1",
  "metadata": {
    "name": "kafka-connect-xapi",
    "namespace": "xapi",
    "selfLink": "/apis/kafka.strimzi.io/v1beta1/namespaces/xapi/kafkaconnects/kafka-connect-xapi/scale",
    "uid": "f9055528-79af-4012-8026-2ac80d3b81ff",
    "resourceVersion": "204389134",
    "creationTimestamp": "2020-07-31T10:27:37Z"
  },
  "spec": {
    "replicas": 3
  },
  "status": {
    "replicas": 3
  }
}

status.selector has no value?

@scholzj
Member Author

scholzj commented Aug 2, 2020

@exherb Hmm, looks like in the actual resource it is named .status.podSelector ... but in the CRD .status.selector 🤦‍♂️ ... I opened #3432 to fix it.

As a workaround, you should be able to replace labelSelectorPath: .status.selector with labelSelectorPath: .status.podSelector and reapply the CRDs.
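In CRD terms, the workaround amounts to something like this in the subresources section of the affected CRDs (a sketch; the path names are taken from the comments above):

```yaml
subresources:
  status: {}
  scale:
    specReplicasPath: .spec.replicas
    statusReplicasPath: .status.replicas
    # workaround: point at the field the operator actually populates
    labelSelectorPath: .status.podSelector
```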

klalafaryan pushed a commit to klalafaryan/strimzi-kafka-operator that referenced this pull request Oct 21, 2020
…nd Connectors (strimzi#3165)

* Add support for scale subresource in KafkaConnect, its derivatives and KafkaConnector

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Apply suggestions from code review

Signed-off-by: Jakub Scholz <www@scholzj.com>

Co-authored-by: Tom Bentley <tombentley@users.noreply.github.com>

* Review comments

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Add scaling to Bridge and MM

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Make existing tests pass and fix some review comments

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Add CRD tests

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Regen of API reference after rebase

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Add more tests for setting the right statuses in the different resources, try to fix the CRD tests on Minikube

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Fix unused imports and improve CHANGELOG.md

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Travis seems to be too fast?

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Try to fix Travis race conditions

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Try to fix Travis race conditions II

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Debug Travis

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Bump kubectl to 1.16.0 since the issue seems to be caused by Kubernetes bug #81342

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Cleanup previous debug and fix attempts

Signed-off-by: Jakub Scholz <www@scholzj.com>

* Use better name in waitFor method

Signed-off-by: Jakub Scholz <www@scholzj.com>

Co-authored-by: Tom Bentley <tombentley@users.noreply.github.com>