Scalers don't retry if connection is lost #2415

pascallap · 2021-12-22T21:35:06Z

Report

If the RabbitMQ scaler is installed or launched while RabbitMQ isn't up, there is no reconnection attempt.

We only get an error:
ERROR scalehandler error resolving auth params {"scalerIndex": 0, "object": {"apiVersion": "keda.sh/v1alpha1", "kind": "ScaledJob", "namespace": "anamespace", "name": "servicename"}, "trigger": 0, "error": "error establishing rabbitmq connection: dial tcp xxx.xxx.xx.xx:5672: i/o timeout"}

Even once RabbitMQ becomes available, no reconnection is attempted.
The only workaround is to restart the keda-operator.

Expected Behavior

A reconnection retry.

Actual Behavior

No retry.

Steps to Reproduce the Problem

Install KEDA in the keda namespace.
Install RabbitMQ in the default namespace.
Configure a network policy on the default namespace that blocks all external traffic.
Install a scaler in the default namespace.
---- Failure ---- An error appears in the KEDA operator log.
Delete the network policy to allow traffic.
---- Nothing ---- KEDA does not reconnect to RabbitMQ.

Logs from KEDA operator

ERROR	scalehandler	error resolving auth params	{"scalerIndex": 0, "object": {"apiVersion": "keda.sh/v1alpha1", "kind": "ScaledJob", "namespace": "anamespace", "name": "servicename"}, "trigger": 0, "error": "error establishing rabbitmq connection: dial tcp xxx.xxx.xx.xx:5672: i/o timeout"}

KEDA Version

2.5.0

Kubernetes Version

1.19

Platform

Amazon Web Services

Scaler Details

RabbitMQ

Anything else?

This is a big problem for us, since we run RabbitMQ in the cluster and shut the clusters down each night.

When they come back up, KEDA starts too quickly, and RabbitMQ is not fully up when KEDA tries to establish the connection.

zroubalik · 2022-01-03T12:32:18Z

Thanks for opening this issue. It seems that this bug was introduced by the caching mechanism implemented in 2.5.0 (#2187), and it is not exclusive to RabbitMQ.

Lyrositor · 2022-01-03T12:37:40Z

I can confirm this; I have experienced it with the Redis Lists scaler as well.

Reverting to 2.4.0 fixes the issue.

JorTurFer · 2022-01-09T00:00:50Z

I think that this PR solves the problem

pascallap added the bug Something isn't working label Dec 22, 2021

zroubalik added this to the v2.6.0 milestone Jan 3, 2022

zroubalik changed the title ~~Rabbitmq Scaler doesn't retry if connection is lost~~ Scalers don't retry if connection is lost Jan 3, 2022

JorTurFer self-assigned this Jan 7, 2022

zroubalik mentioned this issue Jan 12, 2022

Metrics server suddenly stoped getting metrics from rabbitmq #2476

Closed

zroubalik closed this as completed Jan 21, 2022
