Add new operator flag to control Elasticsearch health observation intervals #5861
I understand that we need to do something if the provided interval is negative, but I'm not sure that we should disable the observer. What does it mean for ECK to manage an ES cluster without an observer? Isn't it dangerous to reconcile ES without knowing the real health?
I think you are right in general that disabling the observer is maybe a step too far, at least without compensating for it with a synchronous observation in the reconciliation loop. I forgot that the synchronous observation only happens when the observer is first constructed, which only happens on settings changes (e.g. a certificate changed or similar). However, I also think that the notion of "real health" is flawed. We are always working with a health observation that is by default up to 10 seconds old, and potentially even older if Elasticsearch is slow to respond. So adjusting the observation interval just moves the needle on the staleness scale from at worst 10 seconds to maybe 10 hours stale. I am moving this back to draft mode to see if I can come up with a solution and also address the negative value issue for the annotation.
@thbkrkr I have made it so that what I wrote in the OP is now true: when asynchronous observation is disabled, a synchronous observation is made on each reconciliation. This still has some drawbacks, as the operator cannot react to changes in Elasticsearch health between reconciliations, but at least each reconciliation works with non-stale health data when it happens. That said, I am also open to going back to your idea of simply validating that the interval is positive.
The behaviour looks good to me.
I left some minor comments on names and constants.
…ervals (elastic#5861) Annotations on individual Elasticsearch resources take precedence over the flag to avoid breaking existing customisations for users. A non-positive value disables asynchronous observation completely; in that case a single synchronous observation happens during each reconciliation. This also disables timely automatic pod disruption budget adjustment on health changes. As a side effect of client-go cache refreshes, reconciliations (and therefore observations) still happen every 10 hours. This means disabling the asynchronous observation has the same effect as setting the observation interval to 10 hours.
Fixes #5839
Introduce a new global flag to control the Elasticsearch health observation interval.