Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add complex scaling logic for custom formula & external scaling via grpc server #4583

Closed
wants to merge 21 commits into from

Conversation

gauron99
Copy link
Contributor

@gauron99 gauron99 commented May 29, 2023

UPDATE

Ive separated 2 functions in this PR into 2 and am keeping this as draft only for references as references

  • new feat: formula here
  • new feat: grpc servers as external calculations here

Summary

adds new structure in SO that describes how to modify/alter fetched metrics. User can describe new target to scale on within this structure. IF this is defined, new composite-scaler-metric is created and passed on to HPA (hpa.go). HPA will ask for only this one metric and internally all external metrics will be fetched and used in scale_handler.go.
2 new features within this struct are formula and external calculations (user-defined grpc server).

  • formula -> define formula with trigger names to modify metrics how ever you want using https://github.com/antonmedv/expr. Value returned from this will be attached to composite-scaler metric and returned as a final metric.
  • external calc -> user can defined grpc server and call it via url. New .proto file was created to match this. KEDA (as client) will try to connect to this server and pass its metrics via method Calculate.

one can chain multiple external calcs together when they define multiple. (its an array of grpc servers)

Order of execution: Fetch all external metrics from triggers -> external calcs execute in order (from top to bottom) as given in SO -> apply formula (you can use last name of grpc server if you want to manipulate its returned metric further)

Checklist

  • implement timeout & fallback for EC (e2e test)
  • create more test scenarios ( unit tests fallback)
  • create more test scenarios ( unit tests scale_handler)
  • admission webhook checkers
  • security for grpc connection
  • Changelog has been updated and is aligned with our changelog requirements
  • A PR is done to update the documentation on (docu-link)
  • Commits are signed with Developer Certificate of Origin (DCO - learn more)

Fixes #3567
Fixes #2440

controllers/keda/hpa.go Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
@carlreid
Copy link

carlreid commented Jul 4, 2023

Hey, first of all, not to push you at all regarding the work on this PR. I am wondering what kind of priority these changes have, so I can see if I should wait for this to be released to use AND logic in KEDA, or look into some other potential solutions in the short term

@gauron99
Copy link
Contributor Author

gauron99 commented Jul 4, 2023

@carlreid hello, Im working on this now daily. All functions are implemented, working on tests and some other specifics now to comply with PR standards. Im expecting to finish in few days then add few days for reviews etc.

@carlreid
Copy link

carlreid commented Jul 4, 2023

Hey @gauron99, thanks for the update! Then it makes sense for me to pause working on changes my side and use the great work you're doing here 👍

pkg/scaling/scale_handler.go Fixed Show fixed Hide fixed
pkg/scaling/scale_handler.go Fixed Show fixed Hide fixed
@gauron99
Copy link
Contributor Author

/run-e2e external_scaling*

Copy link
Contributor Author

@gauron99 gauron99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for this big PR, shouldve separated grpc servers into its own PR. Ive made some self-reviews and what i thought were important points


// ComplexScalingLogic describes advanced scaling logic options like formula
// and gRPC server for external calculations
type ComplexScalingLogic struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a point from @JorTurFer kedacore/keda-docs#1189 (comment) that this structure could use a better name - something like modifiers in yaml file. In such case i think it'd be good to change this one as well

// that KEDA can connect to with collected metrics and modify them. Each server
// has a timeout and tls certification. If certDir is left empty, it will
// connect with insecure.NewCredentials()
type ExternalCalculation struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this struct as well

@@ -141,6 +166,10 @@ type ScaledObjectStatus struct {
// +optional
ResourceMetricNames []string `json:"resourceMetricNames,omitempty"`
// +optional
CompositeScalerName string `json:"compositeScalerName,omitempty"`
// +optional
ExternalCalculationHealth map[string]HealthStatus `json:"externalCalculationHealth,omitempty"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i created a new structure for grpc servers to have their own health status. Option number 2 could be to add them to already existing health status and prefix them with something like external-calculator or modifier depending on name of struct (see above). Or not prefix it at all and just keep it how it is but all with one health struct


// ValidateComplexScalingLogic validates all combinations of given arguments
// and their values
func ValidateComplexScalingLogic(so *ScaledObject, specs []autoscalingv2.MetricSpec) (float64, autoscalingv2.MetricTargetType, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not so sure these validation funcs should be in apis/keda/v1alpha/scaledobject_webhook.go but wasnt sure where to put them. there is dependency cycle if its in hpa.go (where the other call to this function is) because it imports kedav1alpha1


message Response {
MetricsList list = 1;
string error = 2;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

response possibly doesnt need error string because generated Calculate method is implemented with error return value by default

const externalCalculatorStr string = "externalcalculator"

// TODO: gauron99 - possible refactor this if trying to unify status updates & fallback functionality
func isFallbackEnabled(scaledObject *kedav1alpha1.ScaledObject, metricSpec v2.MetricSpec, determiner string) bool {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback functionality is also used to update status of health of metrics. I think this could be separated and refactored. I talked briefly with @zroubalik about this. Health status is updated multiple times (for each metric separate). I would propose locally changing the status via variable and only at the end of GetScaledObjectMetrics() in scale_handler.go it would be updated in SO in cluster. While its being manipulated with it would use the local variable. (If this is acceptable Id create an issue and like to implement this separately because this pr is too big already)

@@ -401,43 +451,55 @@ func (h *scaleHandler) ClearScalersCache(ctx context.Context, scalableObject int

// GetScaledObjectMetrics returns metrics for specified metric name for a ScaledObject identified by its name and namespace.
// It could either query the metric value directly from the scaler or from a cache, that's being stored for the scaler.
func (h *scaleHandler) GetScaledObjectMetrics(ctx context.Context, scaledObjectName, scaledObjectNamespace, metricName string) (*external_metrics.ExternalMetricValueList, error) {
func (h *scaleHandler) GetScaledObjectMetrics(ctx context.Context, scaledObjectName, scaledObjectNamespace, metricsName string) (*external_metrics.ExternalMetricValueList, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is renamed because when matched metric is found its assigned - metricName := spec.External.Metric.Name.

Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
…ne scaling logic for external calculator list

Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
… scale_handler test one

Signed-off-by: gauron99 <fridrich.david19@gmail.com>
…ation funcs

Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
…anup

Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
…dation

Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
Signed-off-by: gauron99 <fridrich.david19@gmail.com>
@gauron99 gauron99 marked this pull request as ready for review July 19, 2023 12:11
@gauron99 gauron99 requested a review from a team as a code owner July 19, 2023 12:11
Copy link
Member

@tomkerkhove tomkerkhove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would have been nice to have a design review on the proposed configuration/approach first, if I may be frank.

Given the big addition/difference, I'd like to review this in depth but am on holiday for 2.5w now so might be delayed.

This feature can be powerful, but needs to be accessible as well.

@gauron99
Copy link
Contributor Author

gauron99 commented Jul 20, 2023

I think it would have been nice to have a design review on the proposed configuration/approach first

I understand. I discussed this with Zbynek a bit. No need to worry about review speed and enjoy your holiday! I'm prepared to change whatever is necessary according to what you guys decide, I understand its a big change I really should've split the two. Thats my bad.

@gauron99 gauron99 marked this pull request as draft July 23, 2023 15:24
@gauron99
Copy link
Contributor Author

marked this PR draft to serve as a unity of 2 new functionalities but will separate into 2 PRs

@zroubalik
Copy link
Member

Closing in favor 2 separate PRs

@zroubalik zroubalik closed this Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide support for using AND logic for triggers Support custom formula for multiple metrics
4 participants