[sled-agent] Don't block InstanceManager on full MPSCs #6913

Merged 4 commits into main on Oct 25, 2024

Conversation


@hawkw hawkw commented Oct 22, 2024

Sled-agent's `InstanceManager` task is responsible for managing the
table of all instances presently running on the sled. When the
sled-agent receives a request relating to an individual instance on the
sled, it's sent to the `InstanceManager` over a `tokio::sync::mpsc`
channel, and is then dispatched by the `InstanceManager` to the
`InstanceRunner` task responsible for that individual instance by
sending it over a *second* `tokio::sync::mpsc` channel. This is where
things start to get interesting.[^1]

`tokio::sync::mpsc` is a *bounded* channel: there is a maximum number of
messages which may be queued by a given MPSC channel at any given time.
The `mpsc::Sender::send` method is an `async fn`, and if the channel is
at capacity, that method will _wait_ until there is once again space in
the channel to send the message being sent. Presently,
`mpsc::Sender::send` is called by the `InstanceManager`'s main run loop
when dispatching a request to an individual instance. As you may have
already started to piece together, this means that if a given
`InstanceRunner` task is not able to process requests fast enough to
drain its channel, the entire `InstanceManager` loop will wait when
dispatching a request to that instance until the queue has been drained.
This means that if one instance's runner task has gotten stuck on
something, like waiting for a Crucible flush that will never complete
(as seen in #6911), that instance will prevent requests being dispatched
to *any other instance* managed by the sled-agent. This is quite
unfortunate!

This commit fixes this behavior by changing the functions that send
requests to an individual instance's task to instead *shed load* when
that instance's request queue is full. We now use the
`mpsc::Sender::try_send` method, rather than `mpsc::Sender::send`, which
does not wait and instead immediately returns an error when the channel
is full. This allows the `InstanceManager` to instead return an error to
the client indicating the channel is full, and move on to processing
requests to other instances which may not be stuck. Thus, a single stuck
instance can no longer block requests from being dispatched to other,
perfectly fine instances.
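
For illustration, here is a minimal standalone sketch of the `send` vs. `try_send` difference on a bounded channel. This is a toy example, not sled-agent code:

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    // A bounded channel with room for only one queued message.
    let (tx, _rx) = mpsc::channel::<&str>(1);

    // The first send succeeds immediately because the queue has space.
    tx.send("first").await.unwrap();

    // `send` would now wait until the receiver drains the queue; if the
    // receiver is wedged, the sender is wedged too. `try_send` instead
    // fails immediately, letting the caller shed load and move on.
    match tx.try_send("second") {
        Ok(()) => println!("queued"),
        Err(mpsc::error::TrySendError::Full(msg)) => {
            println!("channel full, shedding request: {msg}");
        }
        Err(mpsc::error::TrySendError::Closed(msg)) => {
            println!("receiver is gone: {msg}");
        }
    }
}
```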

The error returned when the channel is at capacity is converted to an
HTTP 503 Service Unavailable error by the API. This indicates to the
client that their request to that instance was not able to be processed
at this time, but that it may be processed successfully in the
future.[^2] Now, we can shed load while allowing clients to retry later,
which seems much better than the present situation.
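
A rough sketch of how a full-queue error could be surfaced as a 503. The `DispatchError` type and mapping below are illustrative stand-ins (using the `http` crate for status codes), not the actual sled-agent error types:

```rust
use tokio::sync::mpsc::{self, error::TrySendError};

/// Illustrative error for a request that could not be enqueued.
#[derive(Debug)]
enum DispatchError {
    /// The instance's request queue is full; the client may retry later.
    QueueFull,
    /// The instance's runner task has gone away entirely.
    RunnerGone,
}

impl<T> From<TrySendError<T>> for DispatchError {
    fn from(e: TrySendError<T>) -> Self {
        match e {
            TrySendError::Full(_) => DispatchError::QueueFull,
            TrySendError::Closed(_) => DispatchError::RunnerGone,
        }
    }
}

impl DispatchError {
    /// HTTP status a server might choose for each case.
    fn status(&self) -> http::StatusCode {
        match self {
            // 503 signals that the request may succeed if retried later.
            DispatchError::QueueFull => http::StatusCode::SERVICE_UNAVAILABLE,
            // A missing runner task is a server-side problem.
            DispatchError::RunnerGone => http::StatusCode::INTERNAL_SERVER_ERROR,
        }
    }
}

fn main() {
    let (tx, _rx) = mpsc::channel::<u32>(1);
    tx.try_send(1).unwrap();
    // The queue (capacity 1) is now full, so this fails and maps to a 503.
    let err: DispatchError = tx.try_send(2).unwrap_err().into();
    assert_eq!(err.status(), http::StatusCode::SERVICE_UNAVAILABLE);
    println!("{err:?} -> {}", err.status());
}
```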

[^1]: In the sense of "may you live in interesting times", naturally.
[^2]: I also considered returning 429 Too Many Requests here, but my
    understanding is that that status code is supposed to indicate that
    too many requests have been received from *that specific client*. In
    this case, we haven't hit a per-client rate limit; we're just
    overloaded by requests more broadly, so it's not that particular
    client's fault.
@hawkw hawkw requested review from gjcolombo and smklein October 22, 2024 18:52
@hawkw hawkw self-assigned this Oct 22, 2024
@hawkw (Member Author) commented Oct 22, 2024

Since none of this code is really meaningfully exercised by the Omicron test suite (due to the pervasive use of the sled-agent simulator), I'd like to do some manual testing of this change (maybe using omicron-stress) before merging it. However, I figured I'd go ahead and put the PR up for review now.

This should hopefully fix the highest-level symptoms described in #6911, but I'm not sure if I want to have this PR close that issue, since the root cause (Crucible getting stuck when trying to perform a snapshot after a disk has disappeared) still needs to be addressed. Also, a potential follow-up I'd like to address is changing how instances are terminated, so that there's a separate mechanism for terminating an instance that takes priority over whatever else it might be doing. That would have better addressed the specific issue in #6911, where a request to expunge the failed disk could not proceed because the instance's task had gotten wedged waiting for the Crucible snapshot to finish, which was unable to complete...because of the bad disk.

@hawkw (Member Author) commented Oct 22, 2024

Hi @smklein! :)

@@ -1092,13 +1150,48 @@ fn propolis_error_code(
}

/// Describes a single Propolis server that incarnates a specific instance.
#[derive(Clone)]

hawkw (Member Author):

The intent behind deriving Clone for this is to allow the instance's handle to be cloned into tasks spawned by the InstanceManager for the functions that need to send multiple requests to the instance (or requests to multiple instances) rather than just passing in the tx from the original request (e.g. use_only_these_disks). I didn't actually do that in this PR but intend to do so in a follow-up.

Collaborator:

as a tiny nitpick I might punt on deriving clone until that PR, because it doesn't seem necessary in this one.

(but also, not a big deal if this is a pain in the butt)

.map_err(Error::from)
}

/// Create a zone bundle from a named instance zone, if it exists.
- async fn create_zone_bundle(
+ fn create_zone_bundle(

hawkw (Member Author):

While working on this change, I noticed that the error handling in the rest of this function is a little bit off. We presently return errors from create_zone_bundle if the provided zone name is invalid, or if there's no instance for that UUID. These errors are bubbled up into the InstanceManager run loop, which logs an error...but the response tx oneshot just gets dropped, rather than receiving these errors.

This means that the client gets a generic "response dropped" error rather than the more useful error indicating that the zone name is bad or doesn't exist. We should change this function so the errors are sent back over the channel instead, but I felt like that was better done in a separate PR from the rest of this change, for Git history reasons.
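
A tiny sketch of the shape that follow-up could take, sending the error back over the response channel instead of dropping it. All types and names here are hypothetical, not the actual sled-agent code:

```rust
use std::collections::HashSet;
use tokio::sync::oneshot;

// Hypothetical error type for illustration only.
#[derive(Debug)]
enum BundleError {
    InvalidZoneName(String),
    NoSuchInstance(String),
}

// Instead of returning the error to the manager's run loop (which logs it
// and silently drops `tx`), report it to the waiting client over the
// response channel so the caller sees a meaningful error.
fn create_zone_bundle(
    zone_name: &str,
    known_instances: &HashSet<String>,
    tx: oneshot::Sender<Result<(), BundleError>>,
) {
    if zone_name.is_empty() {
        let _ = tx.send(Err(BundleError::InvalidZoneName(zone_name.to_string())));
        return;
    }
    if !known_instances.contains(zone_name) {
        let _ = tx.send(Err(BundleError::NoSuchInstance(zone_name.to_string())));
        return;
    }
    // ... actually assemble the zone bundle here, then:
    let _ = tx.send(Ok(()));
}

fn main() {
    let (tx, mut rx) = oneshot::channel();
    create_zone_bundle("", &HashSet::new(), tx);
    // The client now receives the specific error rather than "response dropped".
    println!("{:?}", rx.try_recv());
}
```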

let (tx, rx) = oneshot::channel();
// This will fail if the tx has been dropped, which we just...don't do.
let _ = instance.get_filesystem_zpool(tx);
let Ok(Ok(Some(filesystem_pool))) = rx.await else {

Collaborator:

Here and in "instance.terminate" below are two spots where we still seem to block on a per-instance basis.

I don't think that needs to block this change -- it's still a move in the right direction -- but "avoid .await-ing on a single instance" seems like a necessary follow-up; otherwise we could still hit the #6911 issue, right?

(I see you have the TODO here, but just confirming that this also applies to .terminate)

hawkw (Member Author):

Yeah, that's correct. I was planning to do that in another branch, but I'm happy to do all of that here, if you prefer?

Collaborator:

Happy to punt it to another PR, if we're doing it somewhere!

sled-agent/src/instance.rs (review thread marked outdated and resolved)
Comment on lines +808 to +810
// TODO(eliza): clone each instance and spawn a task to handle it,
// so that a single misbehaving instance cannot block the instance
// manager's run loop...

Contributor:

This might be a better comment on the follow-up PR, but I'm having trouble envisioning how this will work. Today this routine won't return until it's disposed of every VMM in jobs. Delegating to individual tasks will allow this loop to move on to consider VMM Y even if requests to VMM X aren't currently being serviced, but won't this routine still need to join all of the tasks at the end?

Not a blocker for this PR; just trying to dispel the "what am I missing?" feeling :)

hawkw (Member Author):

I was imagining that the instances that must be disposed of would be cloned into a task that also owns the request; that task would do the whole thing and complete the request when it finishes, so that instances which don't need to be destroyed can go on handling other requests while the disk expungement is being handled. This is in contrast to forcing requests to instances which know nothing about the expunged disk to wait until the expungement has finished. Does that make sense?

Contributor:

It does, thanks!

The bit I was headed toward is that this ultimately means it will be possible to have an instance manager servicing two separate "use only these disks" commands at once, which is new behavior. That may be fine (I haven't thought about it) but I wanted to make sure I understood what was happening before actually raising it as something worth considering.
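
For concreteness, a rough sketch of the per-instance fan-out discussed in this thread, under the assumption that each doomed instance is handled by its own spawned task and the caller still joins them all before completing the original request. All names here are hypothetical:

```rust
use tokio::task::JoinSet;

// Hypothetical stand-in for a cloned per-instance handle.
#[derive(Clone)]
struct InstanceHandle {
    name: String,
}

impl InstanceHandle {
    async fn dispose(&self) {
        // ... terminate the VMM, clean up its zone, release resources ...
        println!("disposed of {}", self.name);
    }
}

/// Dispose of every doomed instance concurrently, so one slow instance
/// doesn't serialize the others, but still wait for all of them before
/// completing the original "use only these disks" request.
async fn dispose_all(doomed: Vec<InstanceHandle>) {
    let mut tasks = JoinSet::new();
    for instance in doomed {
        // Each task owns a clone of the handle and runs independently;
        // the manager's run loop is free to keep servicing other requests.
        tasks.spawn(async move { instance.dispose().await });
    }
    while let Some(result) = tasks.join_next().await {
        if let Err(e) = result {
            eprintln!("disposal task failed: {e}");
        }
    }
}

#[tokio::main]
async fn main() {
    let doomed = vec![
        InstanceHandle { name: "vmm-a".to_string() },
        InstanceHandle { name: "vmm-b".to_string() },
    ];
    dispose_all(doomed).await;
}
```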

@hawkw (Member Author) commented Oct 23, 2024

Ran omicron-stress against this branch (c6035fd) for about a minute and a half and eventually hit this:

2024-10-23T21:01:34.795972Z ERROR omicron_stress: 202: invalid actor state
2024-10-23T21:01:34.796049Z  INFO omicron_stress: 218: Halting actors

which...I'll have to go look into what that actually means --- I'm not sure if that's indicative of an internal bug in omicron-stress or something changed in this branch.

@hawkw (Member Author) commented Oct 23, 2024

Incidentally, the CI failure on c6035fdd788766e6762799aa812d9ac1eab02be6 sure looks like the same Oximeter re-registration issue as #6895/#6916, although it looks like it happened while running a different test: https://buildomat.eng.oxide.computer/wg/0/details/01JAXAGXWPE822857D5D1DP0N4/DMz6rNUPg3AtDtKClnW7rWsxzHSSFztK1IEKxJe1w21To9FJ/01JAXAH9TM5NJ0PDYF6RSJXV5H#S5672

I'm going to re-run that and figure out if it's sticky.

@hawkw (Member Author) commented Oct 23, 2024

> Ran omicron-stress against this branch (c6035fd) for about a minute and a half and eventually hit this:
>
> 2024-10-23T21:01:34.795972Z ERROR omicron_stress: 202: invalid actor state
> 2024-10-23T21:01:34.796049Z  INFO omicron_stress: 218: Halting actors
>
> which...I'll have to go look into what that actually means --- I'm not sure if that's indicative of an internal bug in omicron-stress or something changed in this branch.

ah, cool, it looks like "invalid actor state" has a more detailed error message string inside it, which the Display impl just...eats: https://github.com/oxidecomputer/omicron-stress/blob/8cf99c4520cc861f89f6bcb15a5944c2c7fcc581/src/actor/mod.rs#L52-L53

I'll fix that and hopefully it'll tell us what's actually going on.

@hawkw (Member Author) commented Oct 23, 2024

aha:

2024-10-23T21:25:17.983062Z ERROR omicron_stress: 202: invalid actor state: instance inst1 unexpectedly in state Failed
2024-10-23T21:25:17.983160Z  INFO omicron_stress: 218: Halting actors

and if I grep for everything that happened to inst1, the last bit immediately before the "invalid state" looks like this:

2024-10-23T21:25:08.354247Z  INFO actor{name="inst1_0"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 167: sending instance stop request
2024-10-23T21:25:08.571988Z  INFO actor{name="inst1_7"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 148: sending instance start request
2024-10-23T21:25:11.124327Z  INFO actor{name="inst1_4"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 128: sending instance create request body=InstanceCreate { description: "inst1", disks: [], external_ips: [], hostname: Hostname("inst1"), memory: ByteCount(1073741824), name: Name("inst1"), ncpus: InstanceCpuCount(1), network_interfaces: None, ssh_public_keys: None, start: true, user_data: "" }
2024-10-23T21:25:12.029219Z  WARN actor{name="inst1_2"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 196: instance delete request returned result=Err(Error Response: status: 400 Bad Request; headers: {"content-type": "application/json", "x-request-id": "fcc1c2cb-d1dd-45c4-947a-9f43ab5a1894", "content-length": "181", "date": "Wed, 23 Oct 2024 21:25:11 GMT"}; value: Error { error_code: Some("InvalidRequest"), message: "cannot delete instance: instance is running or has not yet fully stopped", request_id: "fcc1c2cb-d1dd-45c4-947a-9f43ab5a1894" })
2024-10-23T21:25:12.320203Z  INFO actor{name="inst1_7"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 160: instance start request returned result=Ok(Instance { description: "inst1", hostname: "inst1", id: b9f4d940-0e7f-4f59-8293-4d5bc5360898, memory: ByteCount(1073741824), name: Name("inst1"), ncpus: InstanceCpuCount(1), project_id: 74debae8-1eb6-46b9-b5a1-2620c0c08301, run_state: Starting, time_created: 2024-10-23T21:23:25.173472Z, time_modified: 2024-10-23T21:23:25.173472Z, time_run_state_updated: 2024-10-23T21:25:02.436754Z })
2024-10-23T21:25:13.536644Z  INFO actor{name="inst1_3"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 167: sending instance stop request
2024-10-23T21:25:15.316398Z  INFO actor{name="inst1_6"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 179: instance stop request returned result=Ok(Instance { description: "inst1", hostname: "inst1", id: b9f4d940-0e7f-4f59-8293-4d5bc5360898, memory: ByteCount(1073741824), name: Name("inst1"), ncpus: InstanceCpuCount(1), project_id: 74debae8-1eb6-46b9-b5a1-2620c0c08301, run_state: Starting, time_created: 2024-10-23T21:23:25.173472Z, time_modified: 2024-10-23T21:23:25.173472Z, time_run_state_updated: 2024-10-23T21:25:13.483238Z })
2024-10-23T21:25:15.406186Z  WARN actor{name="inst1_1"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 196: instance delete request returned result=Err(Error Response: status: 400 Bad Request; headers: {"content-type": "application/json", "x-request-id": "6f8c02bf-a32b-45ca-bca0-36ee5f1045ab", "content-length": "181", "date": "Wed, 23 Oct 2024 21:25:14 GMT"}; value: Error { error_code: Some("InvalidRequest"), message: "cannot delete instance: instance is running or has not yet fully stopped", request_id: "6f8c02bf-a32b-45ca-bca0-36ee5f1045ab" })
2024-10-23T21:25:16.219577Z  INFO actor{name="inst1_7"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 128: sending instance create request body=InstanceCreate { description: "inst1", disks: [], external_ips: [], hostname: Hostname("inst1"), memory: ByteCount(1073741824), name: Name("inst1"), ncpus: InstanceCpuCount(1), network_interfaces: None, ssh_public_keys: None, start: true, user_data: "" }
2024-10-23T21:25:17.679867Z  INFO actor{name="inst1_5"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 186: sending instance delete request
2024-10-23T21:25:17.948918Z  INFO actor{name="inst1_0"}:antagonize{instance_name="inst1"}: omicron_stress::actor::instance: 179: instance stop request returned result=Ok(Instance { description: "inst1", hostname: "inst1", id: b9f4d940-0e7f-4f59-8293-4d5bc5360898, memory: ByteCount(1073741824), name: Name("inst1"), ncpus: InstanceCpuCount(1), project_id: 74debae8-1eb6-46b9-b5a1-2620c0c08301, run_state: Failed, time_created: 2024-10-23T21:23:25.173472Z, time_modified: 2024-10-23T21:23:25.173472Z, time_run_state_updated: 2024-10-23T21:25:17.431056Z })
2024-10-23T21:25:17.983062Z ERROR omicron_stress: 202: invalid actor state: instance inst1 unexpectedly in state Failed
2024-10-23T21:25:17.983503Z  INFO actor{name="inst1_0"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983535Z  INFO actor{name="inst1_1"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983565Z  INFO actor{name="inst1_2"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983596Z  INFO actor{name="inst1_3"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983651Z  INFO actor{name="inst1_4"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983769Z  INFO actor{name="inst1_5"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983885Z  INFO actor{name="inst1_6"}: omicron_stress::actor: 176: sending halt request
2024-10-23T21:25:17.983961Z  INFO actor{name="inst1_7"}: omicron_stress::actor: 176: sending halt request

I need to think through what this actually means...

@hawkw (Member Author) commented Oct 24, 2024

> aha:
>
> 2024-10-23T21:25:17.983062Z ERROR omicron_stress: 202: invalid actor state: instance inst1 unexpectedly in state Failed
> 2024-10-23T21:25:17.983160Z  INFO omicron_stress: 218: Halting actors
>
> ...

Okay, update on this: it looks like instances are also going to Failed on the latest main commit from yesterday (0f8db77), so that's not related to this branch. I'm guessing it's related to the crucible-upstairs panic that @augustuswm and I observed on the colo rack yesterday, but I'll try to see if there are any cores sitting around after the stress run to confirm that. In any case, it doesn't seem related to this change, so that's good.

@hawkw (Member Author) commented Oct 24, 2024

> Incidentally, the CI failure on c6035fdd788766e6762799aa812d9ac1eab02be6 sure looks like the same Oximeter re-registration issue as #6895/#6916, although it looks like it happened while running a different test: https://buildomat.eng.oxide.computer/wg/0/details/01JAXAGXWPE822857D5D1DP0N4/DMz6rNUPg3AtDtKClnW7rWsxzHSSFztK1IEKxJe1w21To9FJ/01JAXAH9TM5NJ0PDYF6RSJXV5H#S5672
>
> I'm going to re-run that and figure out if it's sticky.

Hahahahha, when I reran that job, it failed with a similar Oximeter timeout in another Oximeter test (integration_tests::oximeter::test_oximeter_database_records): https://buildomat.eng.oxide.computer/wg/0/details/01JAXAGXWPE822857D5D1DP0N4/DMz6rNUPg3AtDtKClnW7rWsxzHSSFztK1IEKxJe1w21To9FJ/01JAXKSQHRAC3MSG0YH2DBMECP#S5660

I'm still pretty sure that's not my fault...

@hawkw hawkw enabled auto-merge (squash) October 24, 2024 18:08

@hawkw hawkw merged commit 78ee981 into main Oct 25, 2024
16 checks passed
@hawkw hawkw deleted the eliza/instance-mangling branch October 25, 2024 17:22
hawkw added a commit that referenced this pull request Oct 30, 2024
Stacked on top of #6913 

Presently, sled-agent sends requests to terminate an instance to the
`InstanceRunner` task over the same `tokio::sync::mpsc` request channel
as all other requests sent to that instance. This means that the
`InstanceRunner` will attempt to terminate the instance only once other
requests received before the termination request have been processed,
and an instance cannot be terminated if its request channel has filled
up. Similarly, if an instance's `InstanceRunner` task is waiting for an
in-flight request to the VMM to complete, the request to terminate the
instance will not be seen until the current request to Propolis has
returned. This means that if the instance has gotten stuck for some
reason --- e.g., because it is attempting a Crucible snapshot that
cannot complete because a physical disk has gone missing, as seen in
#6911 --- the instance cannot be terminated. Sadly, in this case, the
only way to resolve the stuck request is to terminate the instance, but
we cannot do so *because* the instance is stuck.

This seems unfortunate: if we try to kill an instance because it's doing
something that it will never be able to finish, it shouldn't be able to
say "no, you can't kill me, I'm too *busy* to die!". Instead, requests
to terminate the instance should be prioritized over other requests.
This commit does that.

Rather than sending termination requests to the `InstanceRunner` over
the same channel as all other requests, we instead introduce a separate
channel that's *just* for termination requests, which is preferred over
the request channel in the biased `tokio::select!` in the
`InstanceRunner` run loop. This means that a full request channel cannot
stop a termination request from being sent. When a request to the VMM is
in flight, the future that awaits that request's completion is now one
branch of a similar `tokio::select!` with the termination channel. This
way, if a termination request comes in while the `InstanceRunner` is
awaiting an in-flight instance operation, it will still be notified
immediately of the termination request, cancel whatever operation it's
waiting for, and go ahead and terminate the VMM immediately. This is the
correct behavior here, since the terminate operation is intended to
forcefully terminate the VMM *now*, and is used internally for purposes
such as `use_only_these_disks` killing instances that are using a
no-longer-extant disk, or the control plane requesting that the
sled-agent forcibly unregister the instance. "Normal" requests to stop
the instance gracefully will go through the `instance_put_state` API
instead, sending requests through the normal request channel and
allowing in-flight operations to complete.
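
A minimal sketch of the prioritized-termination pattern described above, using a separate channel and a biased `tokio::select!`. The message and channel names here are illustrative, not the actual sled-agent types:

```rust
use tokio::sync::mpsc;

// Illustrative request types.
#[derive(Debug)]
enum Request {
    Ping,
}

#[derive(Debug)]
struct TerminateRequest {
    // ... details about why the VMM is being torn down ...
}

async fn run(
    mut request_rx: mpsc::Receiver<Request>,
    mut terminate_rx: mpsc::Receiver<TerminateRequest>,
) {
    loop {
        tokio::select! {
            // `biased` polls the branches in order, so a pending termination
            // request always wins over anything in the normal request queue.
            biased;

            Some(_term) = terminate_rx.recv() => {
                // Tear down the VMM immediately, no matter how many normal
                // requests are still queued behind it.
                println!("terminating now");
                break;
            }
            Some(req) = request_rx.recv() => {
                match req {
                    Request::Ping => { /* handle a normal request */ }
                }
            }
            else => break,
        }
    }
}

#[tokio::main]
async fn main() {
    let (req_tx, req_rx) = mpsc::channel(8);
    let (term_tx, term_rx) = mpsc::channel(1);
    let runner = tokio::spawn(run(req_rx, term_rx));

    // Even with normal requests queued, the termination is seen promptly.
    req_tx.send(Request::Ping).await.unwrap();
    term_tx.send(TerminateRequest {}).await.unwrap();
    runner.await.unwrap();
}
```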