Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Charm should be reinitialized at every hook execution in Harness #736

Closed
PietroPasotti opened this issue Apr 6, 2022 · 38 comments
Closed
Assignees
Labels
needs design needs more thought or spec

Comments

@PietroPasotti
Copy link
Contributor

More specifically, before every hook execution.
Point is: when a 'real' charm runs, it is reinitialized afresh.
While if you do fire eventA and eventB on the harness, the harness won't reinitialize the charm it holds in between, which may result in some major behavioural differences between a live charm and a harness'd charm.

Use cases include:

class Charm(CharmBase):
     def __init__(self):
         self.relation1 = relation1 = RelationRequirer()
         self.relation2 = RelationProvider(relation1.data[self.unit])

class Charm(CharmBase):
     def __init__(self):
         if self.container.can_connect():
             self.framework.observe(self.on.foo, self._foo)

Concrete example of an affected charm:
https://github.com/canonical/prometheus-k8s-operator/blob/main/src/charm.py

@PietroPasotti
Copy link
Contributor Author

@simskij
Copy link
Member

simskij commented Apr 14, 2022

I believe it's a bit more nuanced than that, but please correct me if I'm wrong. My understanding is that if it is a custom event, then the call is made recursively without exiting, while hooks for proper Juju events would indeed exit between executions.

@PietroPasotti
Copy link
Contributor Author

Very true! We'd have to check if the event being triggered is a custom one or has been fired by charm code.
In 'real life' we could discern between 1) events that are being resurrected (because they had been deferred in the past), 2) events that are 'new' and being manually fired by charm code and 3) 'the event' that is causing this execution of the charm.

Only on 3) we want to reinitialize the charm.

If we are not in a live charm, we have to be more clever because there's no context telling us 'this is the event we are running', so a possible workaround would have to set some fancy event flags from within harness.emit().

@rwcarlsen
Copy link
Contributor

Defer also sneaks its way into this. Deferred evens are run together on the same charm instance/setup as the event that comes in and triggers them.

@PietroPasotti
Copy link
Contributor Author

Another side-effect of this issue is: Harness never fires framework.on.commit
(advanced) features that depend on that hook to fire after every (other) hook execution are not testable with Harness.

@PietroPasotti
Copy link
Contributor Author

@rwcarlsen I wrote a little POC wrapper for us to play with: https://github.com/PietroPasotti/harness-extensions#harness_ctx

@rwcarlsen
Copy link
Contributor

A good first run at the problem. I really think this is going to intersect with the event sequence testing work (e.g. #696). This idea of sandboxing a particular charm scenario then running it and doing assertions seems to be very integral with both these ideas.

@sed-i
Copy link
Contributor

sed-i commented Sep 20, 2022

I'm working on ingress for prometheus now and this feature is painfully needed.

@PietroPasotti
Copy link
Contributor Author

@sed-i have you tried out the harness_ctx?

@sed-i
Copy link
Contributor

sed-i commented Sep 21, 2022

@sed-i have you tried out the harness_ctx?

What's that? (Didn't find mentions in the usual places.)

@PietroPasotti
Copy link
Contributor Author

What's that? (Didn't find mentions in the usual places.)

https://github.com/PietroPasotti/harness-extensions#harness_ctx

Afraid it wasn't really advertised, except for a comment on this issue. I'm working on its successor, but in the meanwhile, you could give this a go.

@jameinel
Copy link
Member

To be honest, I'm not really sure what the intent of this is, but at first glance it seems like an antipattern

         self.relation1 = relation1 = RelationRequirer()
         self.relation2 = RelationProvider(relation1.data[self.unit])

I really wouldn't want the act of relating A to B to cause A & C's relation to behave wildly differently.
At least, as I read this, this says that this charm has multiple possible relation endpoints, and it only responds on a given (related) endpoint when a different related endpoint receives the name of that endpoint on a different relation.

I understand the more general concern about the Harness and the fact that it doesn't simulate the full teardown and refresh that happens when actually processing hooks. But I'd be concerned if this was a motivating use case (unless I'm misunderstanding something)

@jameinel
Copy link
Member

(at a minimum, relating A to B should not require that B knows the names of the endpoints of charm A. That is certainly an abstraction break)

@benhoyt
Copy link
Collaborator

benhoyt commented May 1, 2023

Because defer means events may actually get run on the same charm instance, I'm not sure about just making a change to Harness here. We could change ops to always re-initialize the charm between running deferred events and between deferred events and the main event. Or we could go all in on "the charm instance may be the same; don't rely on this". In any case, leaving open for now.

Oh, and see also Pietro's work on ops-scenario.

@sed-i
Copy link
Contributor

sed-i commented Jun 28, 2023

This issue puts charm authors in an odd position:

  • If they use harness for development (i.e. before going to manual tests) then they get incorrect feedback about how their operator logic operates, because under harness they cannot enjoy the benefits of charm re-init. This leads to overcomplicated code.
  • If they refactor existing code, they may get false positives from harness and find themselves at a crossroads:
    • ihntroduce some ugly workaround in unit tests that behaves similar to re-init?
    • introduce unnecessary update methods in charm code so harness picks up the changes?
    • delete the test?
    • migrate to scenario tests?

I now have failing utests after refactoring and I suspect it's because of this issue.

@benhoyt
Copy link
Collaborator

benhoyt commented Jun 28, 2023

@sed-i I'm not sure I understand this:

because under harness they cannot enjoy the benefits of charm re-init. This leads to overcomplicated code.

This must mean that charms are storing state on their charm instance that would lead to "benefits of charm re-init". But isn't that a no-no? That is, charms should never be storing state in the charm instance, because it won't stay around anyway. What kind of state are you / others storing on the charm instance that would be affected by this.

And if your hooks are expecting charm state to be re-inited on every hook, that means defer right now won't work as expected, right? Because currently we don't re-init the charm between running deferred events and the "real" event.

I'd be interested to know what kind of state issue it is that's causing the unit test failures you're seeing.

That said, I'm still open to considering changing ops (and Harness) to reinitialize the charm on every external event, to make things like this easier to avoid and reason about. I wonder if there are any pitfalls / backwards compatibility concerns of doing that.

@sed-i
Copy link
Contributor

sed-i commented Jun 28, 2023

I'm not sure I understand this:

because under harness they cannot enjoy the benefits of charm re-init. This leads to overcomplicated code.

This must mean that charms are storing state on their charm instance

Not sure what you mean by "storing state", but compare real charm and harness for juju relate:

Juju Harness
Instruction juju relate us:rel them self.harness.add_relation("rel", "them")
Relation data updated Yes Yes
Charm re-init? Yes No
Helper objects reconstructed Yes No

And if your hooks are expecting charm state to be re-inited on every hook, that means defer right now won't work as expected, right? Because currently we don't re-init the charm between running deferred events and the "real" event.

Correct, but thankfully we do not use defer.
That being said, processing custom events also does NOT reinit the charm, which is a bit inconvenient.

I'm still open to considering changing ops (and Harness) to reinitialize the charm on every external event, to make things like this easier to avoid and reason about.

Avoiding needing to reason about it at all would be even better :D
But I'm surprised you mention "every external event". Afaiu, ops in fact does reinit the charm on every juju core event.

@jdkandersson
Copy link

@benhoyt for additional context, we store an abstraction of the config, relation data etc on the charm during __init__. So if an event gets deferred and there are config changes that occur without the charm being re-initialised then we could have an old copy of the state. It would be good to be consistent with whether the charm is re-initialised and make that a guarantee devs can rely on. I.e., there shouldn't be a difference with how the event is triggered that makes the charm be re-initialised or not. And test behaviour should definitely match production behaviour as closely as possible

@PietroPasotti
Copy link
Contributor Author

in fact does reinit the charm on every juju core event

ops inits the charm once, on main.main(CharmType), i.e. reinits the charm on every juju event.
indeed it's an option to move the charm initialization logic to Framework.reemit() so that a fresh charm instance processes each and every event (juju or custom).

I'm afraid that would break a hell of a lot of code, but maybe it's a good thing.

sed-i added a commit to canonical/alertmanager-k8s-operator that referenced this issue Jul 4, 2023
sed-i added a commit to canonical/alertmanager-k8s-operator that referenced this issue Jul 5, 2023
Also:
* Add web_external_url as extra SAN
* Don't need 'container_pebble_ready' after 'begin_with_initial_hooks'
* Refactor with config builder
* Refactor with workload manager
* Drop StoredState
* Skip some utests because of canonical/operator#736

---------

Co-authored-by: Luca Bello <luca.bello@canonical.com>
Co-authored-by: Pietro Pasotti <starfire.daemon@gmail.com>
@simskij
Copy link
Member

simskij commented Jul 6, 2023

@benhoyt

I think it would be great if we could come to some kind of conclusion on this so that we can move forward. It's currently a bit hampering that we have disparate behavior in Juju and in Harness.

@benhoyt
Copy link
Collaborator

benhoyt commented Jul 7, 2023

I've been doing some more thinking about this and discussing with various folks, and I think I understand the cases where this is problematic for charmers, or at the very least confusing.

For example, with @jdkandersson's example, they're storing a snapshot of the current state of the charm in a dataclass in charm __init__, and expecting that to be reinitialised between every hook. I think that expectation is fair, because normally this is the case, that is, when Juju fires the hook. However, when an event is deferred by ops and re-emitted at the next Juju event, the charm instance isn't reinitialised between running the previously-deferred event and the incoming event, so if the deferred event modified config (say), that won't be reflected in the charm.state dataclass instance.

It would be significantly simpler to reason about this if ops did reinitialise the charm instance before every hook call. This would apply to deferred events and Juju events (but not custom events -- I'll comment on #952, but I don't think that's a good idea). And of course we'd then update the Harness to match this behaviour.

I think this would be a backwards-compatible change, because charms shouldn't be relying on the charm instance not being new between events (again, normally it is a new instance, except for deferred events). So you might be able to come up with a contrived example that breaks, but it shouldn't break real charms, or if it does, that's probably exposing an actual bug in the charm.

I'd like to talk this through with @jameinel next week to ensure I'm not missing some historical or other context (he has a lot of background here).

Also, if @PietroPasotti or others have ideas as to how this should be implemented, I'm all ears. It's a bit tricky. Currently the charm is only instantiated once in ops/main.py, so maybe it would mean saving a reference to the charm on the framework, and then manually calling framework.charm.__init__(framework) before every event that's not a custom event.

@benhoyt benhoyt added the needs design needs more thought or spec label Oct 4, 2023
@sed-i
Copy link
Contributor

sed-i commented Oct 13, 2023

Another facet of this issue is that in harness tests I need to manually clear collected statuses in between parts of the same test:

self.harness.update_config(bad_cfg)
self.harness.evaluate_status()
self.assertIsInstance(self.harness.charm.unit.status, BlockedStatus)

# Without this, the BlockedStatus persists from before
self.harness.charm.unit._collected_statuses.clear()  # <----- HERE

self.harness.update_config(good_cfg)
self.harness.evaluate_status()
self.assertIsInstance(self.harness.charm.unit.status, ActiveStatus)

@sed-i
Copy link
Contributor

sed-i commented Oct 13, 2023

This ☝️ also means that custom events would never be able to remediate a status, because we only have the add_status interface.
For example, if an earlier func adds a status, which could later on be resolved by a custom event (e.g. ingress lib emits "ingress ready"), then there is no way to clear the previous status, except waiting for the next core hook.

@benhoyt
Copy link
Collaborator

benhoyt commented Oct 15, 2023

@sed-i I've tried to address the Harness.evaluate_status issue in #1048. I think that's just a bug in the implementation and it should have always cleared the previous collected statuses. Let me know what you think.

@benhoyt
Copy link
Collaborator

benhoyt commented Oct 15, 2023

@sed-i As for your second comment, can you please give a more specific example of how this would happen? Custom events are fired "synchronously" during a hook execution, so by the time the collect-status event is executed after the hook is done, things will be in a stable state.

@sed-i
Copy link
Contributor

sed-i commented Oct 17, 2023

Thanks @benhoyt!

Custom events are fired "synchronously" during a hook execution, so by the time the collect-status event is executed after the hook is done, things will be in a stable state.

That is correct if we only add_status in the charm's collect_status. If components (such as charm libs) observe that event, we may be in trouble. But you rightfully commented that best practice is to only do it in the charm. Currently both are undocumented pitfalls iirc.

@benhoyt
Copy link
Collaborator

benhoyt commented Oct 23, 2023

For reference, see canonical/alertmanager-k8s-operator#202 for an example of the kind of place this crops up.

@PietroPasotti
Copy link
Contributor Author

this came up in juju/juju-controller#56 as well

@gruyaume
Copy link

@benhoyt is there any effort placed on this bug? This makes harness less reliable in terms of validating that the test results match real life expectations.

We have to do these types of workarounds to avoid the issue:

@tonyandrewmeyer
Copy link
Contributor

Hi @gruyaume this is on our work list for the current cycle, and we still expect it to be completed before Madrid.

@gruyaume
Copy link

Hi @gruyaume this is on our work list for the current cycle, and we still expect it to be completed before Madrid.

Awesome, that's great to know!

cc @dariofaccin

@tonyandrewmeyer
Copy link
Contributor

We can have Harness instantiate a new charm object (and have the framework forget the old one, or also create a new Framework object) when it handles the emitting. However, we are left with significant problems.

Analysis

I tried three main variants of having a fresh charm instance for each (Juju) hook:

  • One that created a fresh charm when the event was being emitted
  • One that created a fresh charm after the event was emitted, so that if there was a second (Juju) event the charm would be fresh, but you don't lose anything done between begin and the event
  • One that special-cased begin_with_initial_hooks so that it would only "reinitialise" once rather than 4+ times

Running this version of Harness against my collection of Canonical charms (currently 144, around 6 of which fail with the latest version of ops), the best case was the latter, and still 61 charms had failing tests, so >50 more than without the change. That's more breakage than we're willing to accept (or to help mitigate).

Some of the issues are actually problems with the charm (for example, there's a charm that does harness._charm when it should do harness.charm - I'll open a PR to fix that), and it's usually a fairly small number of tests that fail, not all of them. The failures all seem resolvable (e.g. by changing the way that mocking is done). However, this is a lot of adjustment across a lot of repos and teams.

I also tried one other approach, where I tracked the attributes set on the charm and raised an error if one was accessed that wasn't present during __init__ (more on this below).

sdcore-upf-k8s

The most recent example in the ticket where this is a problem - and one where the tests are currently doing a charm reinitialisation, is score-upf-k8s.

The tests all pass with ops 2.11. 9 (of 76) tests fail if the code that reinitialises the charm is removed. 6 tests still fail when using the code from above, because it's not cherry-picking when to reset like the existing code, it's always doing it.

Issues

Charm tests sometimes do the emit() themselves

If the test code looks like this:

harness = ops.testing.Harness(MyCharm)
harness.begin()
harness.charm.on.install.emit()
# some asserts
harness.charm.on.start.emit()

The Harness object is not involved between those two emits, so there's no opportunity to do the charm reset, without really ugly things like Harness monkeypatching emit(). (This doesn't have to be an actual monkeypatch, since Harness actually creates a fresh charm class and could replace .on with something similar but different, but it feels the same to me).

Without patching emit(), there's no solution here other than writing the charm tests in a way that doesn't assume that changes to the charm are consistent (see below for why that can be legitimate in tests). We could add to Harness to make this a nice experience, but we can't make it backwards compatible, and our future focus is more on Scenario than on Harness.

I think in general, if you're using Harness it would be nicer to avoid tests written in this style, and limit yourself to a single emit() (whether explicit or via one of the Harness methods) - e.g. have your fixture or setUp create the Harness object and then have other fixtures or utility functions that arrange it appropriately, so that you are almost always using a fresh charm.

Some tests expect to be able to assert on the charm after the event

Test code that looks like this:

harness = ops.testing.Harness(MyCharm)
harness.begin()
harness.add_relation("foo")
assert harness.charm.x == y

I believe this is reasonable to do, and it means that we cannot do the charm reset after Harness emits events, only before.

This is also an issue when a single Harness call emits multiple events - in particular this happens with begin_with_initial_hooks() (at minimum 4 emit()s, plus more if there are containers, storages, relations, etc) and add_relation() when supplying unit data.

We could special-case begin_with_initial_hooks() (and add_relation()) so that they only reset the charm once, although this then leaves us in a strange middle ground where the charm is reset except in some cases. However, that doesn't meaningfully decrease the number of existing charms' tests that doing a reset breaks.

Some tests make changes to the charm object

There are tests that look like this:

class MyCharm(ops.CharmBase):
    def __init__(self, framework):
        super().__init__(framework)
        self.foo = some.lib.object()

def test_something():
    harness = ops.testing.Harness(MyCharm)
    harness.begin()
    harness.charm.foo = Mock()  # probably actually autospec'ing
    harness.add_relation("bar")
    assert ...

I believe this is also a reasonable thing to do, and it means that we cannot do the charm reset after Harness emits events, only after (directly contradicting the above finding). If there's code like assert harness.charm.foo.called_once() then it can't be done after, either.

This can be avoided by doing the mocking differently - instead of mocking the charm instance attribute, you can mock the class that it's instantiating and assigning to that attribute, for example. However, there are plenty of charms that have the existing behaviour, and I don't feel that we can legitimately expect them to change (in a minor release).

Potential Solutions

Scenario

We could definitely design an API for Harness that removed the ability to modify the charm - for example, we could have a .run method (like _Manager.run) that created the framework and charm instance and did the emitting, and have charmers use that rather than .emit, and also have Harness use it rather than directly emitting. We could have a context manager that gave access to the charm but restricted it to a single emit. There are plenty of other possibilities.

(There's also the tool that Pietro linked above, although I think generally you'd just use Scenario where those ideas moved to).

However, going forward, our unit testing story is focused on Scenario. Scenario was designed with this issue in mind. Regular use of Scenario, such as:

ctx = scenario.Context(MyCharm)
state2 = ctx.run(event, scenario.State())
assert ...
state3 = ctx.run(event, state2)

does not have this issue, because a fresh charm is created each time and the test code does not have access to it.

If you do need to get access to the charm, then you can do this:

ctx = scenario.Context(MyCharm)
with ctx.manager(event, scenario.State()) as mgr:
    mgr.charm.on = Mock()
    mgr.run()
    assert mgr.charm.on.called_once()

The run() method of the context manager can only be called once, so, although you have access to the charm, only one (Juju) event will be emitted.

Harness test style

It's possible to write Harness tests that would catch charm code that attempts to persist data on the charm instance. Essentially, all you need to do is ensure that you have one (Juju) event per Harness instance (which probably means per test in most situations). Strictly speaking, you should either avoid begin_with_initial_hooks() or ensure that you also have tests that validate each of the (Juju) events that triggers, if it's relevant to the charm. You should also avoid passing unit data to add_relation(), adding the relation before begin() (if you're testing the relation-changed event).

I feel like writing tests in this style is generally a good approach, but not something that we would want to try and enforce or that would apply to every situation. If people are going to put effort into rewriting tests, we would probably rather than they look into writing Scenario (with the 7.x API) tests instead.

Detecting access during test execution

Rather than reinitialising the charm, we could have Harness identify when charms are relying on this behaviour, a little similar to how __slots__ works. Harness could add __getattribute__ and __setattr__ to the charm and raise an error when there's a get before a set, and reset that counter on emitting (Juju) events (by patching the emit method).

I did a proof-of-concept implementation of this approach and didn't get any new charm tests failing, but this example does (on print(self.bar)):

class MyCharm(ops.CharmBase):
    def __init__(self, framework):
        super().__init__(framework)
        self.foo = 1
        framework.observe(self.on.install, self._on_install)
        framework.observe(self.on.start, self._on_start)
    def _on_install(self, event):
        self.bar = 2
        print(self.foo)
    def _on_start(self, event):
        print(self.bar)

harness = ops.testing.Harness(MyCharm)
harness.begin()
try:
    harness.charm.on.install.emit()
    harness.charm.on.start.emit()
finally:
    harness.cleanup()

I could go further into this approach (I haven't spent a huge amount of time on it, so there may be flaws, and perhaps it's not catching everything it should and if so would trigger failures - it does fail on one test_testing test because it asserts on an attribute that's added during the event, for example). It does only address the "people are wrongly relying on attributes persisting across events" issue and not anything else raised in this ticket, though.

Static checking

Similar to the above, we might be able to build a static check as part of the linter development we're considering. 100% confidence is probably too hard to do statically, but I think it wouldn't be too difficult to have a linter warn when it seems like this might be happening. I haven't explored this in any detail, though.

Opt-in

We could add opt-in support to Harness to make this work - the opted in behaviour would break backwards compatibility in some ways, but allow people to use this behaviour if they wanted to. Pietro links to an old PR above, or I could do a fresh one based on what I had done here, that's based off the more recent ops codebase.

However, when Charm-Tech discussed this, we didn't feel that charms generally would opt in to new behaviour. If someone really does want behaviour like this, then we feel like pointing them at Scenario is the better choice.

Other Notes

__init__ behaves differently per event

For the case in the ticket description, where the Charm initialises differently depending on the event: I think we should discourage this as bad practice (basically, what John says). If there is some use-case we're missing there, then I think we could just have an explicit "regenerate charm" Harness method (it would likely also need to regenerate the framework or have the framework forget the old charm and its observers), and people using this pattern could adjust their tests accordingly. I think it would be better to have a ticket specifically for that (with real-world use-cases) rather than mixing it with the "avoid people setting attributes in one event and using them in another" case as in this ticket.

Harness doesn't fire framework events

One of the comments above mentioned that Harness doesn't fire the framework events (pre-commit, commit, collect-status). For collect-status, we have Harness.evaluate_status, which seems sufficient (and I don't think we have heard otherwise). For pre-commit/commit, we had #1115 where we could have added those events (which would have triggered the type check). Doing this for events that Harness emits is simple enough (replacing the various emit() calls with a wrapper that emits and also commits) - doing it when someone does e.g. harness.charm.on.start.emit() can be done but is a lot more work. I think we'd still recommend just using Scenario if you need to test this behaviour.

Deferred events do reuse the charm

There's a little bit of discussion above that suggests making deferred events less special, so they are using a "fresh" charm instance. This would be a change to main rather than to testing. I feel that there is some merit to this, and it seems unlikely that it would cause breakage. There would be a small performance penalty added to deferred events, but that shouldn't be noticeable. However, if we decide we want to do this, I think it should be done in a separate ticket and PR, aimed at making deferred events more consistent, rather than being about Harness. It might be worth a discussion now that we've settled on some defer use being appropriate, but it also might not be worth it if we're going to be able to add a Juju re-emit tool in the near future.

Custom events also reuse the charm

There's also discussion about custom events. I'm less convinced about these than deferred ones, but I do feel like breaking that out into a separate ticket and PR would make most sense (probably even separate from and after one for deferred events). If we have a fresh charm for custom events and for deferred events then non-Harness use would be consistent (and likely Scenario could be also), and we have to accept that Harness wasn't designed with this in mind, and works differently. On the other hand, we do currently say that setting charm attributes during event handlers that later get used by collect-status (to avoid duplicate checks) is ok, and that would break.

I do feel like pushing in that direction is the wrong way - it would be so much better if ops didn't reinitialise on every event. However, changes in that area don't seem close, and consistency is definitely an improvement.

Lost context

Some of the examples linked here have expired, unfortunately - they are to CI logs that have expired, or are to charms where the code has moved on (rather than being to a specific revision), or were changed before merged (e.g.). I could spend time chasing up each of those with the teams but it doesn't seem like it would add a lot of additional context here - but it does suggest that we might need to do more work on collecting specifics in the ticket rather than relying on external information that might not last.

For what it's worth, I can only find one charm that is still doing any sort of framework._forget, which makes it likely that people aren't reinitialising charms in that sort of way (you could perhaps use .clear or do what _forget does directly, instead). The one is sdcore-upf-k8s-operator (which is not in the list of Canonical charms, so missed in my earlier analysis).

Recommendations

  1. Open a ticket for having deferred events get a fresh charm, socialise that the ticket exists, and put it on the roadmap if it gets enough support ( ❤️ 's, maybe?) within some time period. If not, close the ticket and use it as a place to point people when they ask about it.
  2. Charm-Tech discussion about whether I should invest more time into the "detect improper attribute access" approach or not.
  3. Put lots of energy into Scenario, and really encourage people to make use of it.
  4. Open a ticket for giving Harness the ability to regenerate the charm instance (so that people can make use of this if they want to), socialise that the ticket exists, and consider putting it on the roadmap if it gets enough support within some time period. I can re-use work I've done here to implement it (or at least re-use thinking).
  5. Ensure that people know they can come to Charm Tech if they run into this problem, and we can provide advice on how to refactor code or tests to avoid it. This could maybe be a doc, but I don't have a clear picture on what that would look like.
  6. Open a ticket for having Harness emit framework events, socialise that the ticket exists, and consider putting it on the roadmap if it gets enough support within some time period. I can re-use work/thinking I've done here on this.
  7. I spend time looking at sdcore-upf-k8s-operator and come up with an alternative way to do the tests in Harness that avoids needing to do the reset and submit that as a PR, and use that as a basis for guidance for others.

I realise that this isn't the outcome of this ticket that we were hoping for 😞. It's also a concern that there are recent comments (<1 month) about hitting problems around this.

@benhoyt
Copy link
Collaborator

benhoyt commented Apr 1, 2024

Thanks @tonyandrewmeyer for the thorough analysis and write-up. Let's discuss further in our 1:1 which of these recommendations we should go further with, if any, and go from there. One thing I think would be useful (in our chat or before) is to look at the recent comments from me, Pietro, and Guillaume pointing to specific issues, and see how you'd recommend they address those specific issues.

@tonyandrewmeyer
Copy link
Contributor

Summary, after @benhoyt and I discussed this in detail:

  • We agree that it would be much better if Harness did reinitialise the charm, so that the behaviour matched production, and we really did want to provide that. However, we can't do it without breaking a large number of charm tests, or by introducing a new API. We're not willing to do the former, and for the latter we believe that Scenario already provides a compelling choice, and it makes more sense to write a Scenario test than for us to build a (probably Scenario-like) new API on Harness. This is unfortunate, but the Harness design has us painted into a corner.
  • We agree that a fresh charm for deferred events would be a good change, but it's out of scope for this "charm initialisation in Harness" ticket. Opened Run deferred Juju events on a fresh charm instance #1174.
  • We're not going to invest more time into detecting people wrongly re-using charm attributes across Juju events. We encourage people to have some Scenario tests, which will catch this behaviour.
  • We decided against adding a "reinitialise charm" API to Harness. Scenario has this behaviour already, and we'd rather that people spent their resources adding Scenario tests (and Charm Tech spent resources on Scenario improvements).
  • I'll open an issue or PR for suggestions regarding score-upf-k8s-operator. I have only lightly looked over the charm/tests so far, but it seems like the need for reinstantiating the charm can be avoided with a few small changes.
  • We believe that the charm's __init__ should not have code that's conditional on the Juju event (this includes conditional observing). Although the charm is instantiated for each Juju event (outside defer) right now, that may not always be the case, and it's not the ops design. We'll work on improving how the ops story is told to try to improve understanding (in general, no specific action item here). Charms can have guards on observers (including decorators), can have an observer set charm attributes, and so on. We're happy to provide advice to charmers on specific issues around this, and if there are patterns then we'll try to codify those in docs.
  • We're open to having support for Harness to trigger the LifecycleEvents (pre-commit, commit). However, we again recognise that Scenario already provides this for anyone that needs it, and we feel that these are not widely used, so it's not something we should focus on at the moment. If anyone wants this, it should be a new ticket, since it's separate from (although related to) "charm initialisation in Harness".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs design needs more thought or spec
Projects
None yet
Development

No branches or pull requests

10 participants