MGS: Finish extracting SP communications to its own crate #786
Conversation
// same MGS instance - we have two, and local mutexes won't help if both
// are trying to send to the same SP simultaneously. maybe we need some
// kind of time-limited lock from the SP side where it discards any
// incoming data from other sources?
Note - this comment refers to a behavior change compared to what was in the old version of the code. Open to suggestions about what to do here; I suspect the relatively naive mutex solution I had gave the illusion of addressing a problem without really addressing it. It solves "two simultaneous POSTs to a single SP's console via the same MGS instance", but doesn't address simultaneous requests to two MGS instances, or any of the variety of failure modes (e.g., we time out in the middle of a stream of UDP packets).
If we want to enforce mutual exclusion, the SP would need to play a role in it. One could imagine a few different mechanisms. For example, the SP could issue a console token (time-limited, as you say) that a client (presumably nexus, or something else talking to one MGS or the other) could use to permit console writes. Alternatively, the SP could issue a time-limited token to an MGS, and clients would need to remember to route traffic through the proper MGS... and MGS would need to enforce mutual exclusion between clients.
I think any of that would be overkill. The serial console is really an interface of last resort. Until we've made some progress with RFD 247 (and we should! soon!), I'd argue that we should take the simplest path, which is to provide no mechanism for mutual exclusion whatsoever. My guess is that 247 will land at a spot where that is sufficient for MVP.
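To make the first option concrete, here is a minimal sketch of an SP-side, time-limited lease. Everything here is hypothetical (the type names, the TTL, and the `u64` holder id are all invented for illustration); it is only meant to show the shape of the mechanism we would be signing up for:

```rust
use std::time::{Duration, Instant};

/// Hypothetical lease the SP could hand out to a single console writer.
struct ConsoleLease {
    holder: u64, // opaque id for whichever client/MGS holds the lease
    expires_at: Instant,
}

struct SpConsole {
    lease: Option<ConsoleLease>,
}

impl SpConsole {
    /// Assumed TTL, purely for illustration.
    const LEASE_TTL: Duration = Duration::from_secs(30);

    /// Grant (or refresh) the lease unless someone else holds a live one.
    fn try_acquire(&mut self, holder: u64, now: Instant) -> bool {
        match &self.lease {
            // an unexpired lease held by someone else blocks acquisition
            Some(l) if l.expires_at > now && l.holder != holder => false,
            _ => {
                self.lease = Some(ConsoleLease {
                    holder,
                    expires_at: now + Self::LEASE_TTL,
                });
                true
            }
        }
    }

    /// Discard incoming console writes from anyone without a live lease.
    fn accept_write(&self, holder: u64, now: Instant) -> bool {
        matches!(&self.lease, Some(l) if l.holder == holder && l.expires_at > now)
    }
}
```

Even this toy version implies lease renewal, an identity every client has to carry, and a policy for expiry mid-write, which is a decent illustration of why it feels like overkill ahead of RFD 247.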
Force-pushed from 92e5a5c to 95d8c48
looks great; sorry for the delay!
// `version` is intentionally the first 4 bytes of the packet; we
// could check it before trying to deserialize?
yeah; I think we might kick ourselves if we don't...
Changed in 93b27cd
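For anyone reading along, the check is cheap; here is a sketch of its shape (the constant name, error type, and little-endian byte order are assumptions for illustration, not necessarily what 93b27cd does):

```rust
use std::convert::TryInto;

/// Assumed wire-format version constant; the real crate's value and byte
/// order may differ.
const VERSION: u32 = 1;

#[derive(Debug)]
enum DecodeError {
    TooShort,
    VersionMismatch { expected: u32, got: u32 },
}

/// Peek at the leading 4-byte version field before paying for a full
/// deserialization attempt on a packet that can't possibly match.
fn check_version(packet: &[u8]) -> Result<(), DecodeError> {
    let prefix = packet.get(..4).ok_or(DecodeError::TooShort)?;
    let got = u32::from_le_bytes(prefix.try_into().unwrap());
    if got == VERSION {
        Ok(())
    } else {
        Err(DecodeError::VersionMismatch { expected: VERSION, got })
    }
}
```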
// decide whether this is a response to an outstanding request or an
// unprompted message
thinking out loud: we're going to have other unprompted messages, e.g. for errors. While we're content (I think) to poll for updates to the serial console, we'll need to figure out how we forward errors along. Somehow MGS needs to know how to poke Nexus, and it seems like MGS will need some ability to register a handler, like the response handlers but effectively bound to a message class rather than to a `request_id`
That sounds right to me. MGS needs to be up even if nexus is down, which makes "MGS doesn't know (explicitly) about nexus but allows callers to register for messages" seem correct. I've been assuming MGS will only keep its relevant state in memory, which means callers would have to reregister after a restart. Does that seem reasonable?
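A sketch of what that registration could look like (hypothetical names throughout, with std channels standing in for whatever MGS actually uses):

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;
use std::sync::Mutex;

/// Hypothetical classes of unprompted SP -> MGS messages.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum MessageClass {
    SerialConsole,
    Error,
}

/// In-memory registry routing unprompted messages by class, alongside the
/// existing per-request_id response handlers. Because the state lives only
/// in memory, callers would have to reregister after an MGS restart.
#[derive(Default)]
struct UnpromptedHandlers {
    handlers: Mutex<HashMap<MessageClass, Sender<Vec<u8>>>>,
}

impl UnpromptedHandlers {
    fn register(&self, class: MessageClass, tx: Sender<Vec<u8>>) {
        self.handlers.lock().unwrap().insert(class, tx);
    }

    /// Deliver an unprompted message; if no one has registered for this
    /// class (e.g., nexus is down), it is silently dropped here, though a
    /// real implementation would presumably log it.
    fn dispatch(&self, class: MessageClass, payload: Vec<u8>) {
        if let Some(tx) = self.handlers.lock().unwrap().get(&class) {
            let _ = tx.send(payload);
        }
    }
}
```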
// we don't care to check the return value here; this will be `Some(_)`
// if we're being dropped before a response has been received and
// forwarded on to our caller, and `None` if not (because a caller will
// have already extracted the sending channel)
Would it be valuable to leave behind some contents of `requests` in this case that we could reap later? The ostensible benefit would be that we could distinguish a timeout from a logic error. Or would this potentially add more work than it saves?
I'm not sure what we would leave behind, and/or I'm not sure how we would distinguish between a timeout and a logic error that manifested as an undelivered message (i.e., a timeout). Did you have something specific in mind?
Sorry for the delayed reply. I was imagining changing the type of `requests` to something like `Mutex<HashMap<K, MaybeTimedOut<Sender<T>>>>`, and then in `drop` changing the entry to `MaybeTimedOut::TimedOut(SystemTime::now())`... that said, it sounds like a pain in the ass to then reap etc. Sorry for even saying anything... :-)
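Spelling the idea out anyway, since it's easier to weigh with the types written down (purely illustrative; this was not adopted):

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;
use std::sync::Mutex;
use std::time::SystemTime;

/// Instead of removing a request's entry on timeout, tombstone it with the
/// time it timed out, so a later lookup could distinguish "timed out" from
/// "never existed" (a likely logic error).
enum MaybeTimedOut<T> {
    Pending(T),
    TimedOut(SystemTime),
}

type RequestId = u64; // stand-in for the real key type `K`

struct Requests<T> {
    inner: Mutex<HashMap<RequestId, MaybeTimedOut<Sender<T>>>>,
}

impl<T> Requests<T> {
    /// What the waiter's `Drop` impl would do: leave a tombstone rather than
    /// removing the entry (dropping the `Sender` in the process).
    fn mark_timed_out(&self, id: RequestId) {
        if let Some(entry) = self.inner.lock().unwrap().get_mut(&id) {
            *entry = MaybeTimedOut::TimedOut(SystemTime::now());
        }
    }

    /// The "reap later" part that makes this more work than it saves:
    /// something has to periodically sweep tombstones older than a cutoff.
    fn reap(&self, cutoff: SystemTime) {
        self.inner
            .lock()
            .unwrap()
            .retain(|_, v| !matches!(v, MaybeTimedOut::TimedOut(t) if *t < cutoff));
    }
}
```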
All the raw UDP / management switch logic now lives in `gateway-sp-comms`, on which `gateway` depends. This PR includes some miscellaneous cleanup (e.g., the task for receiving incoming packets is somewhat simpler and has one fewer `Arc`, and there are a handful of new unit tests), but no new functionality.