Service Bus Scaler has issues with the minReplicaCount parameter #4541

eugen-nw · 2023-05-15T16:19:31Z

Report

The minReplicaCount parameter is documented as follows, yet it does not seem to have any effect.

The min number of jobs that is created by default. This can be useful to avoid bootstrapping time of new jobs. If minReplicaCount is greater than maxReplicaCount, minReplicaCount will be set to maxReplicaCount.

Expected Behavior

If I set it to 2, for example, 2 Jobs will always be available.
With it set in the script, Job scaling will still work: in the example above, if the number of Messages in the Queue exceeds 2, new Jobs will be started.

Actual Behavior

After setting minReplicaCount: 2 in the Job's script and deploying it, I noticed:

No new Containers were created automatically. I was expecting to see 2 of them.
Sending Messages to the Queue did not create any Containers. I'd have expected to see one Container created for each Message I sent to the Queue.

Steps to Reproduce the Problem

Below is the YAML script that I'm experimenting with:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: aks-aci-genericdev-runner-linux
  labels:
    app: aks-aci-genericdev-runner-linux
    deploymentName: aks-aci-boldiq-genericdev-runner-linux
spec:
  jobTargetRef:
    template:
      spec:
        containers:  # this section is identical as for a "kind: Deployment"
        - name: boldiq-genericdev-runner-linux
          image: <removed>
          imagePullPolicy: Always
          resources:
            requests:
              memory: 6G
              cpu: 2
            limits:
              memory: 6G
              cpu: 2
          env:
          - name: KEDA_SERVICEBUS_CONNECTIONSTRING_LINUX
            value: <removed>
        tolerations:
          - key: virtual-kubelet.io/provider
            operator: Exists
          - key: azure.com/aci
            effect: NoSchedule
        imagePullSecrets:
          - name: docker-registry-secret-linux
        nodeName: virtual-node-aci-linux
  pollingInterval: 1  # 1 second polling for max. responsiveness
  minReplicaCount: 2  # keeping two instances running permanently in order to improve low loads' performance
  triggers:
  - type: azure-servicebus
#    metricType: Value // The default AverageValue with messageCount: '1' starts up a new Container for each Message in the Queue.  We want that for responsiveness.
    metadata:
      queueName: requests
      connectionFromEnv: KEDA_SERVICEBUS_CONNECTIONSTRING_LINUX
      messageCount: '1'

AKS 1.25.6
KEDA 2.10.2
The Containers run on the virtual-node-aci-linux virtual node.

Logs from KEDA operator

example

KEDA Version

2.10.1

Kubernetes Version

1.25

Platform

Microsoft Azure

Scaler Details

Azure Service Bus

Anything else?

No response


JorTurFer · 2023-05-17T09:06:29Z

Hello,
Could you share keda-operator logs in debug please? You can set it using helm: https://github.com/kedacore/charts/blob/main/keda/values.yaml#L198

In debug mode, you will see the metric value on each iteration, the current active job count, and the required job count.

eugen-nw · 2023-05-18T21:55:20Z

Let's hope I did not do something too silly. Please correct my course if necessary.

I ran this command:
helm upgrade -f https://github.com/kedacore/charts/blob/main/keda/values.yaml#L198 keda kedacore/keda --namespace keda

And got this error:

Error: failed to parse https://github.com/kedacore/charts/blob/main/keda/values.yaml#L198: error converting YAML to JSON: yaml: line 28: mapping values are not allowed in this context

JorTurFer · 2023-05-18T22:08:00Z

I didn't know that your command was possible O.O
try with helm upgrade --set logging.operator.level=debug keda kedacore/keda --namespace keda

eugen-nw · 2023-05-18T22:21:50Z

Thank you. Neither did I, I'm just learning Helm as I go. The --set command completed successfully.

However, that value was already present in the output of the helm get all keda -n keda > keda.yaml command:

It appears that the keda-operator-.... pod is non-functional. Earlier I've seen this, now it has the CrashLoopBackOff status:

I ran kubectl logs -n keda keda-operator-db56bccc8-vnjqw, the output is below:

2023-05-18T22:25:44Z    INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": ":8080"}
2023-05-18T22:25:44Z    DEBUG   setup   setting up cert rotation
2023-05-18T22:25:44Z    INFO    setup   Starting manager
2023-05-18T22:25:44Z    INFO    setup   KEDA Version: 2.10.1
2023-05-18T22:25:44Z    INFO    setup   Git Commit: 8adb70e97a08a4690613eef4c4f00391e44e1603
2023-05-18T22:25:44Z    INFO    setup   Go Version: go1.19.7
2023-05-18T22:25:44Z    INFO    setup   Go OS/Arch: linux/amd64
2023-05-18T22:25:44Z    INFO    setup   Running on Kubernetes 1.25      {"version": "v1.25.6"}
I0518 22:25:44.204296       1 leaderelection.go:248] attempting to acquire leader lease keda/operator.keda.sh...
2023-05-18T22:25:44Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
2023-05-18T22:25:44Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
I0518 22:26:01.302501       1 leaderelection.go:258] successfully acquired lease keda/operator.keda.sh
2023-05-18T22:26:01Z    DEBUG   events  keda-operator-db56bccc8-vnjqw_12f276b8-89fe-441d-a569-c0521635f640 became leader        {"type": "Normal", "object": {"kind":"Lease","namespace":"keda","name":"operator.keda.sh","uid":"206ed27f-3b4f-4ebf-8c15-d53ab82cc0c0","apiVersion":"coordination.k8s.io/v1","resourceVersion":"30945"}, "reason": "LeaderElection"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "source": "kind source: *v1alpha1.ScaledObject"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "source": "kind source: *v2.HorizontalPodAutoscaler"}
2023-05-18T22:26:01Z    INFO    Starting Controller     {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication", "source": "kind source: *v1alpha1.TriggerAuthentication"}
2023-05-18T22:26:01Z    INFO    Starting Controller     {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "source": "kind source: *v1alpha1.ScaledJob"}
2023-05-18T22:26:01Z    INFO    Starting Controller     {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob"}
2023-05-18T22:26:01Z    INFO    cert-rotation   starting cert rotator controller
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication", "source": "kind source: *v1alpha1.ClusterTriggerAuthentication"}
2023-05-18T22:26:01Z    INFO    Starting Controller     {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "cert-rotator", "source": "kind source: *v1.Secret"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "cert-rotator", "source": "kind source: *unstructured.Unstructured"}
2023-05-18T22:26:01Z    INFO    Starting EventSource    {"controller": "cert-rotator", "source": "kind source: *unstructured.Unstructured"}
2023-05-18T22:26:01Z    INFO    Starting Controller     {"controller": "cert-rotator"}
2023-05-18T22:26:01Z    INFO    Starting workers        {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "worker count": 1}
2023-05-18T22:26:01Z    INFO    Reconciling ScaledJob   {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "ScaledJob": {"name":"aks-aci-genericdev-runner-linux","namespace":"default"}, "namespace": "default", "name": "aks-aci-genericdev-runner-linux", "reconcileID": "09fae947-d2f4-4aa8-b0bf-85478d97e2c7"}
2023-05-18T22:26:01Z    INFO    cert-rotation   no cert refresh needed
2023-05-18T22:26:01Z    INFO    Starting workers        {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "worker count": 5}
2023-05-18T22:26:01Z    INFO    cert-rotation   certs are ready in /certs
2023-05-18T22:26:01Z    INFO    Starting workers        {"controller": "cert-rotator", "worker count": 1}
2023-05-18T22:26:01Z    INFO    cert-rotation   Ensuring CA cert        {"name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration", "name": "keda-admission", "gvk": "admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
2023-05-18T22:26:01Z    INFO    cert-rotation   Ensuring CA cert        {"name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService", "name": "v1beta1.external.metrics.k8s.io", "gvk": "apiregistration.k8s.io/v1, Kind=APIService"}
2023-05-18T22:26:01Z    DEBUG   Starting a new ScaleLoop        {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "ScaledJob": {"name":"aks-aci-genericdev-runner-linux","namespace":"default"}, "namespace": "default", "name": "aks-aci-genericdev-runner-linux", "reconcileID": "09fae947-d2f4-4aa8-b0bf-85478d97e2c7"}
2023-05-18T22:26:01Z    INFO    Starting workers        {"controller": "triggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "TriggerAuthentication", "worker count": 1}
2023-05-18T22:26:01Z    INFO    Starting workers        {"controller": "clustertriggerauthentication", "controllerGroup": "keda.sh", "controllerKind": "ClusterTriggerAuthentication", "worker count": 1}
2023-05-18T22:26:01Z    INFO    Initializing Scaling logic according to ScaledJob Specification {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "ScaledJob": {"name":"aks-aci-genericdev-runner-linux","namespace":"default"}, "namespace": "default", "name": "aks-aci-genericdev-runner-linux", "reconcileID": "09fae947-d2f4-4aa8-b0bf-85478d97e2c7"}
2023-05-18T22:26:01Z    DEBUG   ScaledJob is defined correctly and is ready to scaling  {"controller": "scaledjob", "controllerGroup": "keda.sh", "controllerKind": "ScaledJob", "ScaledJob": {"name":"aks-aci-genericdev-runner-linux","namespace":"default"}, "namespace": "default", "name": "aks-aci-genericdev-runner-linux", "reconcileID": "09fae947-d2f4-4aa8-b0bf-85478d97e2c7"}
2023-05-18T22:26:01Z    DEBUG   scale_handler   Watching with pollingInterval   {"type": "ScaledJob", "namespace": "default", "name": "aks-aci-genericdev-runner-linux", "PollingInterval": "1s"}
2023-05-18T22:26:01Z    DEBUG   events  Started scalers watch   {"type": "Normal", "object": {"kind":"ScaledJob","namespace":"default","name":"aks-aci-genericdev-runner-linux","uid":"1ba89476-1c93-4fc6-ae6e-da9fed530dbf","apiVersion":"keda.sh/v1alpha1","resourceVersion":"29896"}, "reason": "KEDAScalersStarted"}
2023-05-18T22:26:01Z    DEBUG   scalers_cache   Scaler Metric value     {"ScaledJob": "aks-aci-genericdev-runner-linux", "Scaler": "cache.ScalerBuilder:", "isTriggerActive": true, "s0-azure-servicebus-requests": 1, "targetAverageValue": 1}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2f31b34]

goroutine 420 [running]:
github.com/kedacore/keda/v2/apis/keda/v1alpha1.ScaledJob.MinReplicaCount(...)
        /workspace/apis/keda/v1alpha1/scaledjob_types.go:126
github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).IsScaledJobActive(0x0?, {0x434cea8, 0xc000ce0080}, 0xc000029800)
        /workspace/pkg/scaling/cache/scalers_cache.go:196 +0x1f4
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers(0xc00064f880, {0x434cea8, 0xc000ce0080}, {0x3a84400?, 0xc000029800?}, {0x4338d38, 0xc001307600})
        /workspace/pkg/scaling/scale_handler.go:253 +0x9fa
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop(0xc000ce5800?, {0x434cea8, 0xc000ce0080}, 0xc0004e6b40, {0x3a84400, 0xc000029800}, {0x4338d38, 0xc001307600})
        /workspace/pkg/scaling/scale_handler.go:167 +0x351
created by github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).HandleScalableObject
        /workspace/pkg/scaling/scale_handler.go:125 +0x7ed

eugen-nw · 2023-05-18T22:47:10Z

Line 126 of the keda/apis/keda/v1alpha1/scaledjob_types.go file is below:

I have no Go programming experience, but I realized that the exception occurred because I set minReplicaCount in the deployment YAML but omitted maxReplicaCount, assuming that it would default to 100. Apparently it defaulted to nil. I fixed the deployment YAML and the Job now works properly.

May I suggest initializing s.Spec.Min / MaxReplicaCount to their 0 and 100 default values if not present in the deployment .YAML? That will eliminate some nil checks in the code.
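
The suggestion above can be illustrated with a minimal Go sketch. The ScaledJobSpec type and both accessor methods below are hypothetical stand-ins written for illustration, not KEDA's actual code. They assume the replica counts are optional pointer fields (so an omitted YAML key becomes nil), use the 0 and 100 defaults mentioned above, and cap the minimum at the maximum as the documentation describes.

```go
package main

import "fmt"

// ScaledJobSpec is a hypothetical stand-in for the real spec type.
// Optional fields are pointers, so a key omitted from the YAML is nil.
type ScaledJobSpec struct {
	MinReplicaCount *int32
	MaxReplicaCount *int32
}

// Assumed defaults, taken from the values mentioned in this thread.
const (
	defaultMinReplicaCount int32 = 0
	defaultMaxReplicaCount int32 = 100
)

// MaxReplicas returns the configured maximum, or the default when unset.
func (s ScaledJobSpec) MaxReplicas() int32 {
	if s.MaxReplicaCount == nil {
		return defaultMaxReplicaCount
	}
	return *s.MaxReplicaCount
}

// MinReplicas returns the configured minimum, or the default when unset.
// The result is never larger than MaxReplicas.
func (s ScaledJobSpec) MinReplicas() int32 {
	lo := defaultMinReplicaCount
	if s.MinReplicaCount != nil {
		lo = *s.MinReplicaCount
	}
	if hi := s.MaxReplicas(); lo > hi {
		return hi
	}
	return lo
}

func main() {
	two := int32(2)
	// minReplicaCount is set, maxReplicaCount is omitted (nil).
	spec := ScaledJobSpec{MinReplicaCount: &two}
	fmt.Println(spec.MinReplicas(), spec.MaxReplicas()) // prints "2 100"
}
```

Because every read goes through an accessor that checks for nil, a spec with only minReplicaCount set no longer dereferences a nil pointer.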

eugen-nw · 2023-05-18T23:15:54Z

Now I'm experiencing a different scale-out behavior that's a bit unexpected. As per the script below, I already have 2 Jobs running. If I send 3 Messages to the Queue, I get 3 new Jobs started, for a total of 5 Containers. I'd expect to get only 1 new Job started as I already have 2 of them running.
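
The expectation described above amounts to the following arithmetic, shown as a minimal Go sketch. The expectedNewJobs function is purely illustrative and is not taken from KEDA. It assumes a target of one Message per Job (messageCount: '1'), that already running Jobs count toward the Jobs needed, and that the total is capped at maxReplicaCount (assumed to be 100 here).

```go
package main

import "fmt"

// expectedNewJobs is a hypothetical helper that models the expectation
// stated in this comment: one Job per queued Message, minus the Jobs
// that are already running, capped at maxReplicas and never negative.
func expectedNewJobs(queueLength, runningJobs, maxReplicas int64) int64 {
	wanted := queueLength
	if wanted > maxReplicas {
		wanted = maxReplicas
	}
	n := wanted - runningJobs
	if n < 0 {
		return 0
	}
	return n
}

func main() {
	// 3 Messages in the Queue, 2 Jobs already running, max of 100.
	fmt.Println(expectedNewJobs(3, 2, 100)) // prints 1
}
```

Under these assumptions the result is 1 new Job, whereas 3 new Jobs were actually started.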

eugen-nw · 2023-05-19T17:14:24Z

I checked the documentation at https://keda.sh/docs/2.9/concepts/scaling-jobs/ and my scaling expectations just above are correct. This later issue is tracked by #4554

JorTurFer · 2023-05-22T19:06:52Z

> Line 126 of the keda/apis/keda/v1alpha1/scaledjob_types.go file is below:
>
> I've no GoLang programming experience but realized that exception occurred because I did set minReplicaCount in the deployment .YAML but omitted setting maxReplicaCount, assuming that it will default to 100. Apparently it defaulted to nil. I fixed the deployment .YAML and the Job works now properly.
>
> May I suggest initializing s.Spec.Min / MaxReplicaCount to their 0 and 100 default values if not present in the deployment .YAML? That will eliminate some nil checks in the code.

Nice catch!

JorTurFer · 2023-05-22T19:17:19Z

I have opened a PR with the fix: https://github.com/kedacore/keda/pull/4565/files

JorTurFer · 2023-05-22T19:17:54Z

Is this issue a duplicate of #4554?

eugen-nw · 2023-05-24T00:44:01Z

This is not a duplicate of #4554

JorTurFer · 2023-05-26T17:16:48Z

Could you share the logs having both parameters (min and max) set?
Last time you shared the logs, there was a panic (by the way, the fix is already merged), so the information I want to check is missing.

eugen-nw · 2023-05-26T18:17:48Z

I'll get to this on Tuesday, May 30th. We have a long weekend in the USA.

eugen-nw · 2023-05-30T19:35:25Z

This is fixed now, right? I should close then. And it's not a duplicate of #4554

JorTurFer · 2023-05-30T19:39:24Z

No, no. The merged fix prevents the crash when you set minReplicaCount without setting maxReplicaCount. Was your problem related to that?

eugen-nw · 2023-05-31T00:06:02Z

Yes, the problem I had was that I was setting the minReplicaCount, not setting the maxReplicaCount and scale-out was no longer working because of the null dereference crash. I saw that fix, thanks very much. So this problem is completely addressed.

eugen-nw added the bug Something isn't working label May 15, 2023

keda-automation added this to Roadmap - KEDA Core May 15, 2023

github-project-automation bot moved this to To Triage in Roadmap - KEDA Core May 15, 2023

zroubalik mentioned this issue May 23, 2023

ScaledJob: panic if minReplicaCount is specified & maxReplicaCount is not set #4568

Closed

JorTurFer moved this from To Triage to To Do in Roadmap - KEDA Core May 24, 2023

JorTurFer mentioned this issue May 26, 2023

Job-based Service Bus Scaler scales to too many instances #4554

Closed

eugen-nw closed this as completed May 30, 2023

github-project-automation bot moved this from To Do to Ready To Ship in Roadmap - KEDA Core May 30, 2023
