-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent the deferred events queue to contain duplicates #935
Comments
In my opinion, I'd love to guard against defer generally. If your
preconditions aren't met, you'd really rather have a way to wait for those
preconditions explicitly (a different event), rather than deferring *this*
event. I realize some preconditions can't be made into events (yet). But it
certainly falls into the "you shouldn't defer config-changed because of a
problem with config, because you'll get another config-changed when the
config actually changes".
But if config-changed needed relation-data, then you get into a pretty
sticky situation. Though for that example, why defer at all, vs just
waiting for the other one?
I don't see any reason that you should queue up an event more than once
(given a receiver and event pair). Making it an exception just means that
existing charms will start going into Error state (because they won't be
handling the error). Silently omitting the duplicate is a bit iffy, but
might be the best option.
I'm still of the mind that calling event.defer should *feel bad*, as it
indicates we are missing something in the model, rather than a good
practice.
…On Tue, May 23, 2023 at 9:16 AM PietroPasotti ***@***.***> wrote:
This is a bit of a large topic, but the broad question is: suppose my
charm has deferred a foo event, should we disallow (or enable people to
guard against) a situation where two or more foo deferred events are in
the queue?
A typical use case for event deferrals is a 'missing preconditions' case:
an event can't be processed because some preconditions are missing. If the
same event is triggered again before the preconditions are met, we now have
two copies of that event in the deferral queue. When at some point in the
future the precondition is eventually met and the queue flushes, we'll be
executing that handler multiple times quickly in succession, that may
result in the worst case into bugs (or downtime) such as when a flurry of
pebble service restarts hits a workload container.
Main options I see for addressing this, from more to less aggressive, are:
- raise an AlreadyDeferredError from Event.defer() if the same event
was already deferred (on the same handler).
- (or silently skip adding the duplicate event to the queue, but not a
big fan of that idea)
- allow the charm to inspect the deferral queue: the typical pattern
would be if not self.framework.has_deferred(event): event.defer()
Do we have a 'use case' for not guarding against this, i.e. for leaving
things as they are, and allow the same event to occur multiple times in a
deferral queue?
—
Reply to this email directly, view it on GitHub
<#935>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABRQ7IP4WIFBAADVOR7K7TXHS2CBANCNFSM6AAAAAAYL4JF7Q>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
At first glance what John is saying sounds good, however, @jameinel, wouldn't that approach also be nudging people to an "on any event, reconcile everything" approach to charming? Because you'd have to check for all preconditions on every event. Whereas with defer you get the specific event that was interrupted. In any case, it makes sense to me to avoid duplicates in the queue. I don't see why it would ever be a good thing to defer two of the same event (unless the charm is written in a very odd way). I think we should probably skip adding the deferred event, but log that it happened. It seems like this wouldn't be too hard to fix (though the @PietroPasotti Out of interest, did you run into this issue in "real life" because a charm was queueing up multiple duplicate events and that was causing problems? Or is it more of a "this could be an issue in theory" kind of thing? |
Per voice discussion with John, usually in cases where you'd defer, you need to check all the preconditions anyway. For example, if you need both relation data and config data to write out your workload's configuration file, you have to check for both of those things. |
...
@PietroPasotti <https://github.com/PietroPasotti> Out of interest, did
you run into this issue in "real life" because a charm was queueing up
multiple duplicate events and that was causing problems? Or is it more of a
"this could be an issue in theory" kind of thing?
I know that early on people would do things like "my config isn't correct,
let me defer()" and they would end up with 20 deferred config events.
|
Someone came to me with this piece of code: def _on_config_changed(self, event) -> None:
<do something>
if self.controller_container.can_connect():
self._restart_controller_service()
else:
self.unit.status = MaintenanceStatus("Waiting for kserve-controller to be reachable")
event.defer() and I saw a risk that if the config were to change multiple times before pebble was ready, we might queue multiple deferred config-changed and end up with a flurry of container service restarts as soon as the container can connect. |
I agree with @jameinel.
|
We could add checking for this and eliminate the duplicate event, but I'm not sure it's worth it, given we're planning to re-evaluate or even deprecate |
We didn't end up deprecating defer, and this wasn't resolved in #1205, so re-opening this. This would be done internally by, when |
The complicated part of this is that calling This is unchanged from the "initial import" commit. It seems like this is done for consistency, rather than to protect against the charm crashing and losing an event (Juju will re-emit the event in that case and since we don't commit the storage it's not saved anyway). We could skip storing the notice and snapshot if there's already one in storage - but we'll need to do that when the following Juju event comes along, rather than on Alternatively, we could do a more significant adjustment; for example, having a "happy path" that when there is no |
@tonyandrewmeyer is close to making the draft PR ready for review. |
This is a bit of a large topic, but the broad question is: suppose my charm has deferred a
foo
event, should we disallow (or enable people to guard against) a situation where two or morefoo
deferred events are in the queue?A typical use case for event deferrals is a 'missing preconditions' case: an event can't be processed because some preconditions are missing. If the same event is triggered again before the preconditions are met, we now have two copies of that event in the deferral queue. When at some point in the future the precondition is eventually met and the queue flushes, we'll be executing that handler multiple times quickly in succession, that may result in the worst case into bugs (or downtime) such as when a flurry of pebble service restarts hits a workload container.
Main options I see for addressing this, from more to less aggressive, are:
AlreadyDeferredError
fromEvent.defer()
if the same event was already deferred (on the same handler).if not self.framework.has_deferred(event): event.defer()
Do we have a 'use case' for not guarding against this, i.e. for leaving things as they are, and allow the same event to occur multiple times in a deferral queue?
The text was updated successfully, but these errors were encountered: