Added "planned unit count" to model. #597
Conversation
This is the simplest possible implementation of goal state, designed to give folks a way to access goal state info, without implementing a more complete representation of goal state.
Overall looks good, but some minor changes requested (open to discussion, though :))
So the main caveat for "adding GoalState to the model" is that Gustavo felt
that what Juju was exposing with Goal State (especially wrt the unit names,
etc) was inappropriately 'peeking into a potential future that may never
exist'.
Having the number of units (planned_unit_count) was much more cohesive and
addressed the very real user need, without exposing too many details. (The
key one being "am I an HA cluster and should wait for quorum, or am I
standalone and should just get started")
The issue being that if you `juju deploy ha-app -n 3`, each unit comes up,
and until it has gotten to start, it hasn't joined its relations, so it
*doesn't* see the other units and *doesn't* show up to them. Each unit
therefore sees itself as standalone.
Obviously there were other discussions around goal-state wrt relations,
etc, which is why that data was exposed originally. But I'll also note that
we never 'completed' that work, as there isn't a hook that fires when
goal-state changes (has the related unit become unblocked?). Which means
you can't actually reliably trust those fields.
I think we wanted to be conservative with the initial implementation and
not yet expose the expected number of remote units, nor their state. And
focus on the first necessary step. And charms that *need* to know about the
remote unit count can use the local information to put that into relation
data, which ensures that events get triggered for it.
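A minimal sketch of that pattern (the relation name "cluster" is hypothetical, and the planned-units API is shown with the Application-level name this thread later converges on):

```python
from ops.charm import CharmBase
from ops.main import main


class MyCharm(CharmBase):
    """Sketch: publish this application's planned unit count into relation data
    so that related units receive relation-changed events when it changes."""

    def __init__(self, *args):
        super().__init__(*args)
        # "cluster" is a hypothetical relation name used only for illustration.
        self.framework.observe(self.on.cluster_relation_created, self._publish_planned_units)
        self.framework.observe(self.on.config_changed, self._publish_planned_units)

    def _publish_planned_units(self, event):
        if not self.unit.is_leader():
            return  # only the leader may write application relation data
        relation = self.model.get_relation("cluster")
        if relation is None:
            return
        # Remote units get relation-changed when this value changes, which is
        # the "events get triggered for it" property described above.
        relation.data[self.app]["planned-units"] = str(self.app.planned_units())


if __name__ == "__main__":
    main(MyCharm)
```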
…On Thu, Sep 2, 2021 at 10:38 AM Jon Seager wrote:
In ops/model.py:
> @@ -176,6 +176,18 @@ def get_binding(self, binding_key: typing.Union[str, 'Relation']) -> 'Binding':
"""
return self._bindings.get(binding_key)
+ def get_planned_unit_count(self) -> int:
Hmm. Interesting for a couple of reasons:
- Each unit has a status, meaning we can easily derive planned_units and
pending_units, where planned units is len(self._units) and pending
units is something like len([u for u in self._units if u["status"] !=
"active"]), right? (A rough sketch of that derivation follows this list.)
- I'm still unconvinced this is a model-level construct, and not an
application-level construct, precisely because you can't interrogate the
goal state of *another application*, only yourself, if that makes sense?
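A rough, hypothetical sketch of that derivation against the raw goal-state hook tool output; the output shape assumed here (a top-level "units" mapping with per-unit "status") is an illustration, not a stable contract:

```python
import json
import subprocess


def _goal_state() -> dict:
    # goal-state is a Juju hook tool; with --format=json it returns a mapping
    # that (as assumed here) includes a "units" key of unit-name -> {"status": ...}.
    out = subprocess.run(
        ["goal-state", "--format=json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def planned_units() -> int:
    """All units Juju intends to run for this application, including this one."""
    return len(_goal_state().get("units", {}))


def pending_units() -> int:
    """Planned units that have not yet reached 'active'."""
    units = _goal_state().get("units", {})
    return len([u for u in units.values() if u.get("status") != "active"])
```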
@jameinel Yeah, I think any reference to relations/remote units should be kept out of scope for now.
My impression was that planned_unit_count is for reducing/preventing an "event storm", and was not intended to provide a reliable facility for executing some code only once. IIUC, the same charm code should run fine and produce the same result eventually, with or without relying on planned_unit_count.
Code should perform correctly. The most common failure without it is a unit
coming up thinking that it needed to initialize a node only to find out
later that it actually needed to be part of an HA system. They should,
certainly, eventually become correct either way, but you want to avoid
telling other charms to start sharing their data if you don't have
quorum (I believe rabbitmq fits here).
Ceph is a different example where it cannot do a good job of changing
layout on the fly. And *does* need to know the final layout.
Makes sense for a startup sequence, but wouldn't juju scale-application still have the same challenge? So given the charm must be robust enough to handle incremental scale-application, it seems like planned_unit_count is just for reducing the "event storm". Am I missing something?
get_planned_unit_count -> planned_units. Moved where we expose this to the Application class.
Hat tip to justinmclark's earlier PR, where he figured out the best way to test this.
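With that change, charm code reaches the count through the application object; a hypothetical handler (not code from this PR) might gate startup like this:

```python
from ops.charm import CharmBase
from ops.main import main
from ops.model import ActiveStatus, WaitingStatus


class HACharm(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.start, self._on_start)

    def _on_start(self, event):
        # planned_units() is exposed on Application after this change.
        if self.app.planned_units() > 1:
            # Expected to be an HA deploy: wait for peers before initialising.
            self.unit.status = WaitingStatus("waiting for peers to join")
        else:
            # Standalone deploy: safe to initialise straight away.
            self.unit.status = ActiveStatus()


if __name__ == "__main__":
    main(HACharm)
```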
It is, indeed, the case, with some caveats:
1) While the app is running, reconfiguring can be done, but it may also be
expensive.
2) Scaling is actually relatively infrequent, so 'quick' charms are likely to
not spend a lot of developer effort handling all the edge cases.
3) The biggest problem is the case where all 3 units are starting but each
considers itself to be the only unit, so you have a period of time where
you are, essentially, in split-brain mode without realizing it. This case
is quite different from scaling. It exists partly because Juju doesn't expose
other units to you until you have 'joined', and then we expose them
one-by-one, rather than exposing the most up-to-date information at each
point. But doing it the other way would lead to other issues about having
to evaluate too much of the model, and worrying about everything all the
time, instead of just incremental changes.
Once you say "hey, I'm ready!" from Elastic or Cassandra or whatever to some other application, that application will probably start trying to create keys/indexes/tables/whatever, and re-partitioning the data is an expensive process which may or may not block off or dramatically slow external communications, with the admin guide suggesting that you set up a maintenance window for it. Now your "other" charm fails startup and reports an error back to Juju because something went wrong which never should have started in the first place.

Similarly, adding an OSD to Ceph requires adding things to the keyring, adding it to the CRUSH map, and waiting for data to rebalance. This is expensive. If you wait until everything is ready, there is no rebalance. Or even if you wait until there's very little/no data, it's a cheap/fast operation. Telling Cinder/Glance/Swift that Ceph is ready cascades if you have a single node. Now, hypothetically, Swift comes up and thinks it's ready. And something which is related to Swift starts writing data. And it's a bottleneck/race between "how fast can application X which is writing data to Ceph through Swift perform when Ceph is trying to add a node and rebalance".

Whether or not a charm can handle …
On Fri, Sep 3, 2021 at 6:13 PM Ryan Barry wrote:
In ops/model.py:
> @@ -176,6 +176,18 @@ def get_binding(self, binding_key: typing.Union[str, 'Relation']) -> 'Binding':
"""
return self._bindings.get(binding_key)
+ def get_planned_unit_count(self) -> int:
Coming at it from another angle, I saw the "goal" of goal-state as "I'm
going to be HA and I should be ready for that".
For the Juju core goal state of "a mysql, a webserver", the *overall*
state of the model probably makes sense. From an OF perspective, there are
a couple of cases. They're all different from an application design
perspective, and we cannot plan for all of them, but let's say we have the
following cases:
- A Grafana 'cluster' which will want to know its end goal state so it
does not bother initializing data in a local sqlite database, only to
immediately need to shut down and migrate everything over to MySQL.
Grafana needs to wait for a relation to MySQL in any case, which can be
handled from the charm code itself, but instead of checking "do I have a
relation to a mysql || do I have peer relations" and branching the logic
out, it can interrogate "do I have more than one unit planned? If so, just
wait on DB initialization until I have a MySQL"
So the 'relation-created' hook was introduced to handle this particular
case. "Let me know that there *is* a relation to a database, even if the
database isn't up and running yet". 'relation-created' triggers just after
'install', so you can be informed very early on to expect that there will be
a relation (a small sketch of that pattern follows the list below).
- A Ceph cluster which "needs" to know how many initial nodes it
should have. Sure, Ceph *can* scale up and down, but it requires a
bunch of twiddling to add/remove OSDs on the fly, and waiting for monitor
initialization until all of the units are present is much easier.
- A Cassandra cluster which "needs" to know how many nodes will be
present before initialization so data can be appropriately sharded. Same
basic case as Ceph.
- An etcd cluster, or anything which uses Raft, which won't reach quorum
without a minimum number of nodes
This is an interesting one, as you can imagine that a good charm should be
able to initialize with a single node, and then grow the configuration as
additional units show up for HA. Having that ability also implies that it can handle
day 2 operations when you need to take a node out of the cluster, or scale
up from 1 to 3 to 5 over time.
Goal-state was somewhat designed to handle this case, but I think it
actually makes more sense to just use 'is_leader' instead. If you are the
leader, and you don't see any other nodes, that's fine, initialize with
N=1. When you see the other nodes, add them to the cluster. Raft has it as
an explicit config change that is coordinated with quorum of the current
cluster, but you should always be able to add/remove 1 node at a time.
I think the problem we saw in the past was actually because they weren't
looking at is_leader, and so each unit saw itself as a single-node
deploy.
- A RabbitMQ cluster where node start ordering matters if the entire
cluster goes down (this really doesn't apply to k8s charms, but still)
- Graylog "wants" both ElasticSearch and MongoDB, both of which "want"
to have at least 3 nodes before they are "ready"
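For reference, a minimal sketch of the relation-created pattern mentioned above (the "db" relation name and the Grafana-like framing are illustrative, not taken from any real charm):

```python
from ops.charm import CharmBase
from ops.main import main
from ops.model import WaitingStatus


class GrafanaLikeCharm(CharmBase):
    """Sketch: learn very early that a database relation exists,
    even before the remote database is up and running."""

    def __init__(self, *args):
        super().__init__(*args)
        # "db" is a hypothetical relation name; relation-created fires right
        # after install, before the remote application is necessarily running.
        self.framework.observe(self.on.db_relation_created, self._on_db_created)

    def _on_db_created(self, event):
        # A relation to a database exists, so skip any local-sqlite bootstrap
        # and wait for the remote database to become available instead.
        self.unit.status = WaitingStatus("waiting for database")


if __name__ == "__main__":
    main(GrafanaLikeCharm)
```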
All of these cases are slightly different, but all align on a couple of
points:
- Communicating to other charms that the overall application is "ready"
(that is, that it has reached the *Juju* goal state rather than just the
*charm* goal state) can be done by setting relation data once it actually is
"ready"
- For charms which support iterative startup (like Rabbit through
rabbitmqctl), this can be performed over peer relations, and, again,
readiness communicated to external relations via relation-set
Yes, goal state can help avoid the "event storm" Leon referenced, but
primarily, from my POV, by having whatever charm depends on its relation
counterpart (be that Graylog waiting for an Elastic and Mongo, Grafana
waiting for a MySQL) put itself into WaitingStatus. I used these two as
examples because Graylog requires multiple external relations to achieve
their goal state before it can be initialized, and Grafana needs to wait
for a single one (assuming a single MySQL).
From this perspective, we can avoid an "event storm" by, simply, not
dumping a bunch of relation data over to external relations until
quorum/initialization is complete. That's really up to charm authors, since
they know when application X or Y is "ready" to talk to the outside world,
and charms which require quorum before they can operate shouldn't be
blindly firing off relation data on startup/relation-joined without
checking whether they're operational first.
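A sketch of that guard, with hypothetical "db" (provides) and "cluster" (peer) relation names and an assumed readiness rule of "all planned peers have joined":

```python
from ops.charm import CharmBase
from ops.main import main


class DatabaseProviderCharm(CharmBase):
    """Sketch: only publish data to consumers once the cluster is actually ready."""

    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.db_relation_joined, self._maybe_publish)
        self.framework.observe(self.on.cluster_relation_changed, self._maybe_publish)

    def _cluster_ready(self) -> bool:
        # Hypothetical readiness check: all planned units have joined the peer relation.
        peers = self.model.get_relation("cluster")
        joined = 1 + (len(peers.units) if peers else 0)  # peers.units excludes this unit
        return joined >= self.app.planned_units()

    def _maybe_publish(self, event):
        if not self.unit.is_leader() or not self._cluster_ready():
            return  # stay quiet until the cluster can actually serve requests
        for relation in self.model.relations.get("db", []):
            relation.data[self.app]["ready"] = "true"


if __name__ == "__main__":
    main(DatabaseProviderCharm)
```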
From an OF perspective, it's not reasonable to provide a complete
abstraction for everything the entire application *model* may need, but
we *can* provide an abstraction for what a *single* application may need.
Whew, that got long. But in the end, I agree that this should probably be
on Application for the reasons above.
I feel like we definitely got to the same place. Ceph seems to be the case
that actually needs the unit count known a priori. Most of the others can
actually leverage leadership to make sure they don't get initialized in
split-brain, and then use the relation-joined hooks to grow the cluster.
And for ones that want to avoid the 'use sqlite instead of remote SQL db',
they have relation-created.
John
=:->
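A minimal sketch of that leadership-plus-relation-joined pattern (all names and the placeholder bootstrap/add-member steps are illustrative):

```python
from ops.charm import CharmBase
from ops.main import main


class RaftLikeCharm(CharmBase):
    """Sketch: the leader bootstraps a one-node cluster, and later units
    are added one at a time as they join the peer relation."""

    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.start, self._on_start)
        # "cluster" is a hypothetical peer relation name.
        self.framework.observe(self.on.cluster_relation_joined, self._on_peer_joined)

    def _on_start(self, event):
        if self.unit.is_leader():
            # Only the leader bootstraps, so there is no split-brain: N=1 to begin with.
            self._bootstrap_single_node()

    def _on_peer_joined(self, event):
        if self.unit.is_leader():
            # Grow the cluster incrementally, one node at a time.
            self._add_member(event.unit)

    def _bootstrap_single_node(self):
        ...  # placeholder: start the service as a single-node cluster

    def _add_member(self, unit):
        ...  # placeholder: add the new unit to the cluster membership


if __name__ == "__main__":
    main(RaftLikeCharm)
```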
This is looking good. Just some comments around comments and concepts:
ops/model.py (Outdated):
    planned unit count for foo would be 3.

    We deliberately do not attempt to inspect whether these units are actually running
    or not. That is a task left up to the future, when goal state is more mature.
It'd be nice to not refer to "goal state" here. We want to kill that idea altogether in juju, in the sense that the number of pending units is just part of the current state like everything else.
ops/model.py (Outdated):
    def planned_units(self) -> int:
        """Count of "planned" units that will run this application.

        We use goal-state here, in the simplest possible way. When we implement goal state
Same as above.
ops/model.py
Outdated
Includes the current unit in the count. | ||
|
||
""" | ||
goal_state = self._run('goal-state', return_output=True, use_json=True) |
Might also be worth adding a comment before that line so future readers can understand that perspective. Something along the lines of:
# goal-state as a concept in juju is dying in favor of it being simply the current state,
# so we must not use this API further outside of explicitly designed and agreed cases.
Makes it clear that goal state is deprecated.
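Putting the fragments above together, the method under review might look roughly like this; the final return expression and the shape of the goal-state output are assumptions, not the exact code merged in this PR:

```python
def planned_units(self) -> int:
    """Count of "planned" units that will run this application.

    Includes the current unit in the count.
    """
    # goal-state as a concept in juju is dying in favor of it being simply the
    # current state, so we must not use this API further outside of explicitly
    # designed and agreed cases.
    goal_state = self._run('goal-state', return_output=True, use_json=True)
    # Assumed output shape: a mapping with a top-level "units" key.
    return len(goal_state.get('units', {}))
```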
Nice work, thanks @petevg