Charm should be reinitialized at every hook execution in Harness #736
Comments
example of a horrible workaround I had to use: |
I believe it's a bit more nuanced than that, but please correct me if I'm wrong. My understanding is that if it is a custom event, then the call is made recursively without exiting, while hooks for proper Juju events would indeed exit between executions. |
Very true! We'd have to check if the event being triggered is a custom one or has been fired by charm code. Only on 3) we want to reinitialize the charm. If we are not in a live charm, we have to be more clever because there's no context telling us 'this is the event we are running', so a possible workaround would have to set some fancy event flags from within harness.emit(). |
Defer also sneaks its way into this. Deferred events are run together on the same charm instance/setup as the event that comes in and triggers them. |
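To make the defer point concrete, here is a toy model in plain Python. It is a sketch, not the ops framework: `ToyCharm`, `run_hook` and the `store` dict are hypothetical names, and the ordering (deferred events first, then the incoming one) is an assumption of the sketch.

```python
class ToyCharm:
    """Hypothetical stand-in for a charm; not the real ops CharmBase."""

    def __init__(self, store):
        self.store = store    # models persisted state that outlives the instance
        self.handled = []     # instance state, gone once the hook run ends

    def handle(self, name):
        """Handle one event; return True to ask for it to be deferred."""
        self.handled.append(name)
        if name == "relation-changed" and not self.store.get("ready"):
            return True
        if name == "pebble-ready":
            self.store["ready"] = True
        return False


def run_hook(store, deferred, event_name):
    """Run previously deferred events, then the new one, on ONE instance."""
    charm = ToyCharm(store)
    still_deferred = [n for n in deferred + [event_name] if charm.handle(n)]
    return charm.handled, still_deferred


store = {}
first_handled, deferred = run_hook(store, [], "relation-changed")
second_handled, deferred_after = run_hook(store, deferred, "pebble-ready")
```

In the second run, the re-emitted `relation-changed` and the incoming `pebble-ready` both land in the same `handled` list, i.e. they share one instance, which is the behaviour described above.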
Another side-effect of this issue is: Harness never fires framework.on.commit |
@rwcarlsen I wrote a little POC wrapper for us to play with: https://github.com/PietroPasotti/harness-extensions#harness_ctx |
A good first run at the problem. I really think this is going to intersect with the event sequence testing work (e.g. #696). This idea of sandboxing a particular charm scenario then running it and doing assertions seems to be very integral with both these ideas. |
I'm working on ingress for prometheus now and this feature is painfully needed. |
@sed-i have you tried out the harness_ctx? |
What's that? (Didn't find mentions in the usual places.) |
https://github.com/PietroPasotti/harness-extensions#harness_ctx Afraid it wasn't really advertised, except for a comment on this issue. I'm working on its successor, but in the meanwhile, you could give this a go. |
To be honest, I'm not really sure what the intent of this is, but at first glance it seems like an antipattern
I really wouldn't want the act of relating A to B to cause A & C's relation to behave wildly differently. I understand the more general concern about the Harness and the fact that it doesn't simulate the full teardown and refresh that happens when actually processing hooks. But I'd be concerned if this was a motivating use case (unless I'm misunderstanding something) |
(at a minimum, relating A to B should not require that B knows the names of the endpoints of charm A. That is certainly an abstraction break) |
Because Oh, and see also Pietro's work on ops-scenario. |
This issue puts charm authors in an odd position:
I now have failing utests after refactoring and I suspect it's because of this issue. |
@sed-i I'm not sure I understand this:
This must mean that charms are storing state on their charm instance that would lead to "benefits of charm re-init". But isn't that a no-no? That is, charms should never be storing state in the charm instance, because it won't stay around anyway. What kind of state are you / others storing on the charm instance that would be affected by this. And if your hooks are expecting charm state to be re-inited on every hook, that means defer right now won't work as expected, right? Because currently we don't re-init the charm between running deferred events and the "real" event. I'd be interested to know what kind of state issue it is that's causing the unit test failures you're seeing. That said, I'm still open to considering changing |
Not sure what you mean by "storing state", but compare real charm and harness for
Correct, but thankfully we do not use
Avoiding needing to reason about it at all would be even better :D |
@benhoyt for additional context, we store an abstraction of the config, relation data etc on the charm during |
I'm afraid that would break a hell of a lot of code, but maybe it's a good thing. |
Also:
* Add web_external_url as extra SAN
* Don't need 'container_pebble_ready' after 'begin_with_initial_hooks'
* Refactor with config builder
* Refactor with workload manager
* Drop StoredState
* Skip some utests because of canonical/operator#736
---------
Co-authored-by: Luca Bello <luca.bello@canonical.com>
Co-authored-by: Pietro Pasotti <starfire.daemon@gmail.com>
I think it would be great if we could come to some kind of conclusion on this so that we can move forward. It's currently a bit hampering that we have disparate behavior in Juju and in Harness. |
I've been doing some more thinking about this and discussing with various folks, and I think I understand the cases where this is problematic for charmers, or at the very least confusing. For example, with @jdkandersson's example, they're storing a snapshot of the current state of the charm in a dataclass in charm
It would be significantly simpler to reason about this if ops did reinitialise the charm instance before every hook call. This would apply to deferred events and Juju events (but not custom events -- I'll comment on #952, but I don't think that's a good idea). And of course we'd then update the Harness to match this behaviour.
I think this would be a backwards-compatible change, because charms shouldn't be relying on the charm instance not being new between events (again, normally it is a new instance, except for deferred events). So you might be able to come up with a contrived example that breaks, but it shouldn't break real charms, or if it does, that's probably exposing an actual bug in the charm.
I'd like to talk this through with @jameinel next week to ensure I'm not missing some historical or other context (he has a lot of background here). Also, if @PietroPasotti or others have ideas as to how this should be implemented, I'm all ears. It's a bit tricky. Currently the charm is only instantiated once in |
Another facet of this issue is that in harness tests I need to manually clear collected statuses in between parts of the same test:
self.harness.update_config(bad_cfg)
self.harness.evaluate_status()
self.assertIsInstance(self.harness.charm.unit.status, BlockedStatus)
# Without this, the BlockedStatus persists from before
self.harness.charm.unit._collected_statuses.clear() # <----- HERE
self.harness.update_config(good_cfg)
self.harness.evaluate_status()
self.assertIsInstance(self.harness.charm.unit.status, ActiveStatus) |
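Why the stale status survives can be shown with a toy model. This is a sketch in plain Python, not ops code: `ToyUnit`, `ToyCharm`, `PRIORITY` and the `"valid"` config key are hypothetical, and the only behaviour modelled is "statuses accumulate on the instance and the highest-priority one wins".

```python
# Hypothetical priorities: a higher number wins when several statuses are collected.
PRIORITY = {"blocked": 2, "active": 1}


class ToyUnit:
    def __init__(self):
        self.collected = []   # models the statuses gathered while evaluating
        self.status = None


class ToyCharm:
    def __init__(self, config):
        self.config = config
        self.unit = ToyUnit()

    def evaluate_status(self):
        # Each evaluation appends; nothing in this model clears earlier entries.
        self.unit.collected.append("active" if self.config["valid"] else "blocked")
        self.unit.status = max(self.unit.collected, key=PRIORITY.get)


config = {"valid": False}
reused = ToyCharm(config)
reused.evaluate_status()
status_bad = reused.unit.status

config["valid"] = True
reused.evaluate_status()
status_reused = reused.unit.status   # the earlier "blocked" is still in the list

fresh = ToyCharm(config)             # what a new instance per hook would give
fresh.evaluate_status()
status_fresh = fresh.unit.status
```

With the reused instance the old `"blocked"` entry keeps winning; with a fresh instance the list starts empty, so no manual clearing is needed.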
This ☝️ also means that custom events would never be able to remediate a status, because we only have the |
@sed-i As for your second comment, can you please give a more specific example of how this would happen? Custom events are fired "synchronously" during a hook execution, so by the time the collect-status event is executed after the hook is done, things will be in a stable state. |
Thanks @benhoyt!
That is correct if we only |
For reference, see canonical/alertmanager-k8s-operator#202 for an example of the kind of place this crops up. |
this came up in juju/juju-controller#56 as well |
@benhoyt is there any effort placed on this bug? This makes harness less reliable in terms of validating that the test results match real life expectations. We have to do these types of workarounds to avoid the issue: |
Hi @gruyaume this is on our work list for the current cycle, and we still expect it to be completed before Madrid. |
Awesome, that's great to know! cc @dariofaccin |
We can have
Analysis
I tried three main variants of having a fresh charm instance for each (Juju) hook:
Running this version of
Some of the issues are actually problems with the charm (for example, there's a charm that does
I also tried one other approach, where I tracked the attributes set on the charm and raised an error if one was accessed that wasn't present during
sdcore-upf-k8s
The most recent example in the ticket where this is a problem, and one where the tests are currently doing a charm reinitialisation, is sdcore-upf-k8s. The tests all pass with ops 2.11. Nine (of 76) tests fail if the code that reinitialises the charm is removed. Six tests still fail when using the code from above, because it's not cherry-picking when to reset like the existing code; it's always doing it.
Issues
Charm tests sometimes do the
|
Thanks @tonyandrewmeyer for the thorough analysis and write-up. Let's discuss further in our 1:1 which of these recommendations we should go further with, if any, and go from there. One thing I think would be useful (in our chat or before) is to look at the recent comments from me, Pietro, and Guillaume pointing to specific issues, and see how you'd recommend they address those specific issues. |
Summary, after @benhoyt and I discussed this in detail:
|
More specifically, before every hook execution.
Point is: when a 'real' charm runs, it is reinitialized afresh.
While if you do fire eventA and eventB on the harness, the harness won't reinitialize the charm it holds in between, which may result in some major behavioural differences between a live charm and a harness'd charm.
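The difference can be illustrated without ops at all. The following is a toy model, not the real Harness or Juju dispatch: `ToyCharm`, `run_live`, `run_harness_style` and the `external_url` key are hypothetical names, and the "config" is a plain dict.

```python
class ToyCharm:
    """Stand-in for a charm that snapshots config in __init__ (hypothetical)."""

    def __init__(self, config):
        # Snapshot taken once per instance.
        self.external_url = config.get("external_url", "")

    def handle(self, event_name):
        return f"{event_name}: url={self.external_url!r}"


def run_live(config, events):
    """Model of a live unit: a new instance for every hook execution."""
    results = []
    for name, new_config in events:
        config.update(new_config)
        results.append(ToyCharm(config).handle(name))
    return results


def run_harness_style(config, events):
    """Model of the behaviour described in this issue: one instance reused."""
    charm = ToyCharm(config)
    results = []
    for name, new_config in events:
        config.update(new_config)
        results.append(charm.handle(name))
    return results


events = [
    ("config-changed", {"external_url": "http://a"}),
    ("update-status", {}),
]
live = run_live({}, events)
reused = run_harness_style({}, events)
```

Here `live` sees the updated URL in both events, while `reused` keeps the empty value captured when the single instance was created.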
Use cases include:
Concrete example of an affected charm:
https://github.com/canonical/prometheus-k8s-operator/blob/main/src/charm.py