
AWS SQS Scaler: Add Visible + NotVisible messages for scaling considerations #1664

Merged (1 commit) on Mar 15, 2021

Conversation

@TyBrown (Contributor) commented Mar 11, 2021

This change affects the AWS SQS Scaler and how it decides when to scale a ScaledObject down. Before this change, the scaler only considered ApproximateNumberOfMessages, which only includes pending (visible) queue messages.

After this change, the scaler sums ApproximateNumberOfMessages and ApproximateNumberOfMessagesNotVisible; the latter counts all in-flight messages.
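
A minimal sketch of the new computation, assuming the AWS SDK for Go v1 (this is illustrative, not the actual KEDA scaler source; the queue URL in main is a placeholder):

```go
// Illustrative sketch only, not the KEDA scaler source: fetch both SQS
// attributes in one GetQueueAttributes call and sum them, so the metric
// covers visible (pending) plus not-visible (in-flight) messages.
package main

import (
	"fmt"
	"strconv"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

func approximateQueueLength(client *sqs.SQS, queueURL string) (int64, error) {
	attrs := []string{
		sqs.QueueAttributeNameApproximateNumberOfMessages,           // pending
		sqs.QueueAttributeNameApproximateNumberOfMessagesNotVisible, // in-flight
	}
	out, err := client.GetQueueAttributes(&sqs.GetQueueAttributesInput{
		QueueUrl:       aws.String(queueURL),
		AttributeNames: aws.StringSlice(attrs),
	})
	if err != nil {
		return 0, err
	}
	var total int64
	for _, name := range attrs {
		if v, ok := out.Attributes[name]; ok {
			n, err := strconv.ParseInt(aws.StringValue(v), 10, 64)
			if err != nil {
				return 0, err
			}
			total += n
		}
	}
	return total, nil
}

func main() {
	sess := session.Must(session.NewSession())
	// Placeholder queue URL for illustration.
	n, err := approximateQueueLength(sqs.New(sess), "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue")
	if err != nil {
		panic(err)
	}
	fmt.Println("approximate queue length:", n)
}
```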

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO - learn more)
  • [N/A] Tests have been added
  • [N/A] A PR is opened to update our Helm chart (repo) (if applicable, i.e. when deployment manifests are modified)
  • A PR is opened to update the documentation on (repo) (if applicable)
  • Changelog has been updated

Fixes #1663

@zroubalik (Member) left a comment

Looking good, thanks for the fix!

Could you please update the Changelog as well? (Improvements section)

I wonder whether we should add a note to the docs about this behavior: https://keda.sh/docs/2.1/scalers/aws-sqs/#trigger-specification

WDYT @tomkerkhove ?
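
For reference, a KEDA 2.1 aws-sqs-queue trigger looks roughly like this (values are placeholders); queueLength is the target that, after this PR, is compared against visible plus in-flight messages:

```yaml
triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue  # placeholder
      queueLength: "5"       # target (average) messages per replica
      awsRegion: "us-east-1"
```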

AWS SQS Scaler: Add Visible + NotVisible messages for scaling considerations

Signed-off-by: Ty Brown <tbrown@rewardsnetwork.com>
@TyBrown force-pushed the sqs_queue_in_flight_fix branch from 388f6a2 to 8e766e3 on March 12, 2021 16:36
@TyBrown (Contributor, Author) commented Mar 12, 2021

@zroubalik I've updated the CHANGELOG as requested. Happy to update docs as well if y'all want, just let me know.

@TyBrown requested a review from @zroubalik on March 12, 2021 16:56
@tomkerkhove (Member) commented, quoting @zroubalik:

> Looking good, thanks for the fix!
>
> Could you please update the Changelog as well? (Improvements section)
>
> I wonder whether we should add a note to the docs about this behavior: keda.sh/docs/2.1/scalers/aws-sqs/#trigger-specification
>
> WDYT @tomkerkhove ?

Would be good to just doc it indeed!

@TyBrown (Contributor, Author) commented Mar 12, 2021

kedacore/keda-docs#398 has been opened for the docs update.

@zroubalik (Member) left a comment

LGTM, thanks!

@zroubalik merged commit 397ab81 into kedacore:main on Mar 15, 2021
@TyBrown deleted the sqs_queue_in_flight_fix branch on March 15, 2021 15:02
@sdanzan commented Mar 23, 2021

This should be documented as a breaking change. Counting in-flight messages as part of the queue length changes the behaviour and makes it harder to pick a sensible scale-up threshold for systems with sustained throughput and occasional bursts. If my goal is to keep the queue length close to zero at all times, it used to be enough to pick an arbitrary, sufficiently high non-zero threshold to trigger a scale-up. That threshold could be largely independent of the number of pods needed in a normal situation (and thus could be chosen identically across multiple similar systems); with this change, that is no longer the case.
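
To put hypothetical numbers on this: with 10 pods that each hold one in-flight message and an empty backlog, the metric now reads at least 10 rather than 0, so any queueLength target of 10 or below triggers a scale-up from in-flight messages alone; the usable threshold is therefore coupled to the normal replica count.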

@yaronya (Contributor) commented Jul 14, 2021

This should have been configurable, IMHO.
For my use case, it can take some time for a pod to delete its message from the queue, and in the meantime KEDA spawns another, redundant pod because it now also counts in-flight messages.

I can work around it by changing the polling rate, but that increases the delay for the whole batch, which isn't cool.
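
For anyone else hitting this, the workaround mentioned is the ScaledObject-level pollingInterval; a rough sketch (names and values are placeholders):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-consumer-scaler   # placeholder
spec:
  scaleTargetRef:
    name: sqs-consumer        # placeholder deployment
  pollingInterval: 120        # seconds; default is 30, raising it samples in-flight spikes less often
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue  # placeholder
        queueLength: "5"
        awsRegion: "us-east-1"
```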

@pjaak commented Mar 28, 2022

I agree with the above. We have some messages that can take 30+ minutes to process, so KEDA adds unnecessary pods.

@LiamStorkey (Contributor) commented:

I agree with the above. This now causes resource redundancy, with pods being told to "stay up" while there is no work for them to do. Can we add a configurable flag for this?

@zroubalik (Member) commented:

OK, I wasn't aware of this use case. Is anybody willing to contribute the configuration option?
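
Purely as a sketch of the shape such an option could take (the scaleOnInFlight metadata key is an illustrative name, not a merged KEDA API): keep the post-#1664 behaviour as the default and let the trigger opt out of counting in-flight messages.

```go
// Hypothetical sketch: gate the in-flight term behind a trigger-metadata
// flag. "scaleOnInFlight" is an illustrative name, not a confirmed KEDA option.
package main

import (
	"fmt"
	"strconv"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/sqs"
)

// attributeNamesForTrigger picks which SQS attributes feed the queue-length
// metric, based on the trigger metadata.
func attributeNamesForTrigger(triggerMetadata map[string]string) ([]*string, error) {
	includeInFlight := true // post-#1664 behaviour stays the default
	if raw, ok := triggerMetadata["scaleOnInFlight"]; ok {
		parsed, err := strconv.ParseBool(raw)
		if err != nil {
			return nil, fmt.Errorf("invalid scaleOnInFlight value %q: %w", raw, err)
		}
		includeInFlight = parsed
	}

	names := []*string{aws.String(sqs.QueueAttributeNameApproximateNumberOfMessages)}
	if includeInFlight {
		names = append(names, aws.String(sqs.QueueAttributeNameApproximateNumberOfMessagesNotVisible))
	}
	return names, nil
}

func main() {
	// Opting out restores the pre-#1664 behaviour: visible messages only.
	names, _ := attributeNamesForTrigger(map[string]string{"scaleOnInFlight": "false"})
	for _, n := range names {
		fmt.Println(*n)
	}
}
```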

@pjaak commented Mar 28, 2022

Happy to take a look with @LiamStorkey!

@pimbeenes commented:

@pjaak Did you ever get around to implementing this?

Merging this pull request closes: AWS SQS Queue Scaler fails to account for in-flight messages, premature scale down (#1663)