Documentation for Adding Standard HPA resource metrics to KEDA #188

Merged · 13 commits · Jul 10, 2020
40 changes: 36 additions & 4 deletions content/docs/2.0/concepts/scaling-deployments.md
@@ -34,10 +34,20 @@ spec:
  scaleTargetRef:
    deploymentName: {deployment-name} # must be in the same namespace as the ScaledObject
    containerName: {container-name}   # Optional. Default: deployment.spec.template.spec.containers[0]
  pollingInterval: 30                 # Optional. Default: 30 seconds
  cooldownPeriod: 300                 # Optional. Default: 300 seconds
  minReplicaCount: 0                  # Optional. Default: 0
  maxReplicaCount: 100                # Optional. Default: 100
  advanced:
    horizontalPodAutoscalerConfig:    # Optional. If not set, KEDA won't scale based on resource utilization
      resourceMetrics:
      - name: cpu/memory              # Name of the resource to be targeted
        target:
          type: Utilization/Value/AverageValue
          value: 60                   # Optional
          averageValue: 40            # Optional
          averageUtilization: 50      # Optional
      behavior:
  triggers:
  # {list of triggers to activate the deployment}
```
@@ -93,6 +103,28 @@ Minimum number of replicas KEDA will scale the deployment down to. By default it

This setting is passed to the HPA definition that KEDA will create for a given deployment.

---

```yaml
advanced:
  horizontalPodAutoscalerConfig:
    resourceMetrics:
    - name: cpu/memory
      target:
        type: Utilization/Value/AverageValue
        value: 60              # Optional
        averageValue: 40       # Optional
        averageUtilization: 50 # Optional
    behavior:
```

The above configuration scales the deployment on standard resource metrics such as CPU and memory. It is the same as the standard HorizontalPodAutoscaler configuration: KEDA feeds these values through as resource metrics on the HPA it creates (a concrete sketch follows the list below).
* name: the name of the resource to target as a metric (cpu, memory, etc.)
* type: whether the metric target type is Utilization, Value, or AverageValue.
* value: the target value of the metric (as a quantity).
* averageValue: the target value of the average of the metric across all relevant pods (as a quantity).
* averageUtilization: the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for the Resource metric source type.
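
For instance, a minimal sketch of the advanced block with concrete targets; the resource choices and numbers here are illustrative, not recommendations:

```yaml
advanced:
  horizontalPodAutoscalerConfig:
    resourceMetrics:
    - name: cpu                # track the cpu resource of the pods
      target:
        type: Utilization      # expressed as a percentage of the requested cpu
        averageUtilization: 50 # aim for roughly 50% of requested cpu on average
    - name: memory             # several resource metrics can be listed together
      target:
        type: AverageValue
        averageValue: 512Mi    # target average memory usage per pod, as a quantity
```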

## Long-running executions

One important consideration is how this pattern works with long-running executions. Imagine a deployment that triggers on a RabbitMQ queue message, where each message takes 3 hours to process. If many queue messages arrive, KEDA will help drive scaling out to many replicas, say 4. Now suppose the HPA decides to scale down from 4 replicas to 2. There is no way to control which of the replicas get terminated. That means the HPA may attempt to terminate a replica that is 2.9 hours into processing a 3-hour queue message.
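
The behavior field shown in the spec above accepts the standard HPA scaling-behavior settings, which can slow scale-down, though it still cannot choose which replica is terminated. A minimal sketch, assuming the standard autoscaling/v2beta2 behavior schema and illustrative values:

```yaml
advanced:
  horizontalPodAutoscalerConfig:
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 1800 # wait 30 minutes of low load before scaling down
        policies:
        - type: Pods
          value: 1                       # remove at most one replica
          periodSeconds: 600             # per 10-minute window
```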