@hawkw hawkw commented Jul 16, 2025

As described in [RFD 579], we are now recommending certain non-default
Tokio runtime configurations in all software we deploy in production.
The [`oxide-tokio-rt`] crate provides a way to easily get these common
recommended configurations by using its functions to initialize the
runtime rather than `#[tokio::main]`.

Presently, `oxide-tokio-rt` does the following:

- Enables DTrace probes for Tokio runtime events using [`tokio-dtrace`]
- Disables the [LIFO slot optimization][lifo], which was the root cause
  of omicron#8334

But, we anticipate that if there are additional Tokio configurations
which we want to recommend in the future, we'll add them here as well.

In particular, I think the new DTrace probes added by `tokio-dtrace`
will potentially be very useful if issues like the stuck `mgd`
reconciler task we observed in dogfood are ever encountered again in
production.

This commit updates `maghemite` to initialize the Tokio runtime using
`oxide-tokio-rt`. I've added a Clippy config for warning on uses of
`#[tokio::main]`, as well, to avoid forgetting to use `oxide-tokio-rt`
if new binaries are added. All the configs we set currently in
`oxide-tokio-rt` require the `tokio_unstable` RUSTFLAGS cfg, but it
looks like that's already being set, so that shouldn't be a problem.
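
As a rough illustration of the two pieces of configuration mentioned above (a sketch, not copied from this change): the Clippy warning could be expressed with the `disallowed-macros` setting in `clippy.toml`, and the `tokio_unstable` cfg is commonly passed through `.cargo/config.toml`. The file locations and the reason string are assumptions, and the actual files in the repository may differ.

```toml
# clippy.toml (assumed location and contents)
# Makes Clippy's `disallowed_macros` lint fire on uses of `#[tokio::main]`.
disallowed-macros = [
    { path = "tokio::main", reason = "initialize the runtime via `oxide-tokio-rt` instead" },
]
```

```toml
# .cargo/config.toml (assumed location)
# Passes the cfg flag that Tokio's unstable APIs are gated behind.
[build]
rustflags = ["--cfg", "tokio_unstable"]
```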

I've changed every binary to use `oxide-tokio-rt` with the exception of
`mg-package` --- it didn't seem necessary to use the production-like
config there and maybe avoiding an additional dependency to fetch would
make it build slightly faster, although I'm unconvinced. I did change
the lab binaries to use `oxide-tokio-rt`, as I figured that the DTrace
probes could come in handy there.

[RFD 579]: https://rfd.shared.oxide.computer/rfd/0579
[`oxide-tokio-rt`]: https://github.com/oxidecomputer/oxide-tokio-rt
[`tokio-dtrace`]: https://github.com/oxidecomputer/tokio-dtrace
[lifo]:
    https://rfd.shared.oxide.computer/rfd/0579#_disabling_the_lifo_slot_optimization
@hawkw hawkw requested a review from rcgoodfellow July 16, 2025 22:22

hawkw commented Jul 16, 2025

CI failure looks unrelated: we timed out installing a package. I'm restarting that one.

@rcgoodfellow

> In particular, I think the new DTrace probes added by `tokio-dtrace` will potentially be very useful if issues like the stuck `mgd` reconciler task we observed in dogfood are ever encountered again in production.

Is this referring to the Dendrite dpd NAT reconciler getting stuck? If there is something maghemite related getting stuck, that's something I'd like to know more about.

hawkw commented Jul 17, 2025

> > In particular, I think the new DTrace probes added by `tokio-dtrace` will potentially be very useful if issues like the stuck `mgd` reconciler task we observed in dogfood are ever encountered again in production.
>
> Is this referring to the Dendrite dpd NAT reconciler getting stuck? If there is something maghemite related getting stuck, that's something I'd like to know more about.

Ah, whoops, you're right, that was Dendrite, never mind. Fortunately, I've also opened a similar PR over there.

@hawkw hawkw merged commit 666ccc9 into main Jul 17, 2025
13 checks passed
@hawkw hawkw deleted the eliza/oxide-tokio-rt branch July 17, 2025 01:31