Simplify microvm builder code and make uffd test failures easier to debug #4991
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4991 +/- ##
==========================================
- Coverage 83.07% 83.06% -0.01%
==========================================
Files 244 244
Lines 26634 26658 +24
==========================================
+ Hits 22125 22144 +19
- Misses 4509 4514 +5
I have some questions regarding backwards compatibility of the swagger changes. The rest of the comments are mostly nits.
Also, codecov is complaining about the new functions that help us handle the messiness of static CPU templates. Anything we can do about these?
We never make use of this, and I do not see where this ever _could_ be useful.
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
This makes these fields optional in PATCH /machine-config requests. The comment on this structure says that all fields should be optional, and I don't quite see why these two should be different. Thus, add `serde(default)` to avoid forcing customers to explicitly set them to `null` if they do not want to update these parts of the machine config.
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
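To illustrate the PATCH semantics this commit is after, here is a minimal std-only sketch. It does not use serde; it only models what happens after deserialization, where `serde(default)` on an `Option` field would turn a missing key into `None`. The struct and field names (`MachineConfigUpdate`, `vcpu_count`, `mem_size_mib`, `smt`) and the `apply` method are assumptions for illustration, not the actual Firecracker definitions.

```rust
/// Illustrative stand-in for the full machine config (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct MachineConfig {
    vcpu_count: u8,
    mem_size_mib: usize,
    smt: bool,
}

/// Illustrative PATCH body: every field is optional.
/// `None` means "the request did not mention this field, leave it unchanged".
#[derive(Debug, Default)]
struct MachineConfigUpdate {
    vcpu_count: Option<u8>,
    mem_size_mib: Option<usize>,
    smt: Option<bool>,
}

impl MachineConfig {
    /// Overwrite only the fields that the update actually carries.
    fn apply(&mut self, update: &MachineConfigUpdate) {
        if let Some(vcpu_count) = update.vcpu_count {
            self.vcpu_count = vcpu_count;
        }
        if let Some(mem_size_mib) = update.mem_size_mib {
            self.mem_size_mib = mem_size_mib;
        }
        if let Some(smt) = update.smt {
            self.smt = smt;
        }
    }
}
```

With this shape, a request that only wants to change the memory size can build `MachineConfigUpdate { mem_size_mib: Some(512), ..Default::default() }` and never has to spell out the other fields.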
This struct is almost a 1:1 duplication of `struct MachineConfig`, with only a single deviation when it comes to CPU template handling (see below). This makes it very annoying to add new fields to the /machine-config endpoint, because counter-intuitively we have to hand-edit at least 3 structs to do so. We can get rid of this duplication by merging VmConfig and MachineConfig into just `MachineConfig` (that's what the endpoint is called, so having the struct be the same makes sense).
We now need to handle a bit of nastiness when it comes to CPU templates, because /machine-config can only be used for specifying static CPU templates, while a VmConfig is used to hold whatever CPU template is stored, in whatever way. However, we can handle this at the serde layer, by making serialize/deserialize ignore the field if it doesn't contain a static template. This is ugly, but since static templates are deprecated, we have line of sight to getting rid of this weirdness when we release 2.0.
While we're at it, opportunistically rename functions etc. to uniformly call this thing a "machine config" instead of a "vm config".
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
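The CPU template handling described above can be sketched without serde as a pair of conversions between the stored value and what the API is allowed to see. Everything here is an assumption made for illustration: the type names, the `T2`/`C3` variants, and the two helper methods are hypothetical, and the real change hooks into serialize/deserialize rather than exposing methods like these.

```rust
/// Illustrative static templates (variant names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq)]
enum StaticCpuTemplate {
    T2,
    C3,
}

/// Illustrative custom template; the API endpoint cannot express this.
#[derive(Debug, Clone, PartialEq)]
struct CustomCpuTemplate {
    name: String,
}

/// The merged struct stores whichever kind of template is configured.
#[derive(Debug, Clone, PartialEq)]
enum CpuTemplateType {
    Static(StaticCpuTemplate),
    Custom(CustomCpuTemplate),
}

#[derive(Debug, Clone, PartialEq)]
struct MachineConfig {
    vcpu_count: u8,
    cpu_template: Option<CpuTemplateType>,
}

impl MachineConfig {
    /// "Serialize" view: expose the field only if it holds a static template.
    fn api_cpu_template(&self) -> Option<StaticCpuTemplate> {
        match &self.cpu_template {
            Some(CpuTemplateType::Static(template)) => Some(*template),
            _ => None,
        }
    }

    /// "Deserialize" view: the API can only ever supply a static template.
    fn set_api_cpu_template(&mut self, template: Option<StaticCpuTemplate>) {
        self.cpu_template = template.map(CpuTemplateType::Static);
    }
}
```

The point of the sketch is that a single struct can hold both kinds of template while the endpoint-facing view stays restricted to static ones, so no second near-identical struct is needed.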
If the UFFD handler exits abnormally for some reason, have it take down Firecracker as well by SIGKILL-ing it from a panic hook. For this, reintroduce the "get peer creds" logic. We have to use SIGKILL because Firecracker could be inside the handler for a KVM-originated page fault that is not marked as interruptible, in which case all signals but SIGKILL are ignored (happens for example during KVM_SET_MSRS when it triggers the initialization of a gfn_to_pfn_cache for the kvm-clock page, which uses GUP without FOLL_INTERRUPTIBLE).
While we're at it, add a hint to the generic "process not found" error message to indicate that potentially Firecracker died, and that the cause of this could be the UFFD handler crashing. For example, in firecracker-microvm#4601 the cause of the mystery hang is the UFFD handler crashing, but we were stumped by what's going on for over half a year. Let's avoid that going forward.
We can't enable this by default because it interferes with unit tests, and also the "malicious_handler", so expose a function on `Runtime` to enable it only in valid_handler and fault_all_handler.
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
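As a rough sketch of the panic-hook idea, the following std-only Rust installs a hook that sends SIGKILL to a given PID whenever any thread panics. It is Linux-only and makes several assumptions: the function names `sigkill` and `kill_peer_on_panic` are hypothetical, `kill(2)` is declared by hand instead of going through a crate, and the PID is passed in directly, whereas the commit obtains it via the "get peer creds" logic on the handler's socket.

```rust
use std::panic;

// Hand-written declaration of kill(2) so the sketch needs no external crate.
// Assumption: Linux, where `pid_t` and `int` are both 32-bit signed integers.
extern "C" {
    fn kill(pid: i32, sig: i32) -> i32;
}

/// SIGKILL is signal number 9 on Linux.
const SIGKILL: i32 = 9;

/// Send SIGKILL to `pid`. Returns true if the signal was delivered.
/// Non-positive PIDs are refused, because kill(2) treats them as
/// "process group" or "every process" targets.
fn sigkill(pid: i32) -> bool {
    if pid <= 0 {
        return false;
    }
    // SAFETY: kill(2) takes plain integers and has no memory-safety preconditions.
    unsafe { kill(pid, SIGKILL) == 0 }
}

/// Install a panic hook that first runs the previously installed hook
/// (so the panic message is still printed) and then kills the peer process.
fn kill_peer_on_panic(peer_pid: i32) {
    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        previous(info);
        let _ = sigkill(peer_pid);
    }));
}
```

Because the hook is process-global, installing it unconditionally would also fire in tests that panic on purpose, which is consistent with the commit making it opt-in per handler.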
There is no need to clone the GuestMemoryMmap here, as create_vmm_and_vcpus returns it again (as part of the Vmm object), and since later code in build_microvm_from_snapshot doesn't need to take ownership of the GuestMemoryMmap, we can just use references to this stored object, avoiding the clone.
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
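The ownership pattern behind this change can be shown with a small std-only sketch. The types and functions below (`GuestMemory`, `Vmm`, `create_vmm`, `total_len`, `build_from_snapshot`) are simplified, hypothetical stand-ins, not the real Firecracker signatures; `GuestMemory` deliberately does not implement `Clone`, so the code only compiles because later steps borrow from the returned object.

```rust
/// Simplified stand-in for the guest memory object (no `Clone` on purpose).
struct GuestMemory {
    regions: Vec<Vec<u8>>,
}

/// Simplified stand-in for the VMM, which owns the guest memory.
struct Vmm {
    guest_memory: GuestMemory,
}

/// Takes ownership of the memory and hands it back as part of the `Vmm`.
fn create_vmm(guest_memory: GuestMemory) -> Vmm {
    Vmm { guest_memory }
}

/// A later build step that only needs to read the memory layout.
fn total_len(guest_memory: &GuestMemory) -> usize {
    guest_memory.regions.iter().map(|region| region.len()).sum()
}

/// Move the memory into the VMM once, then borrow it from there.
fn build_from_snapshot(guest_memory: GuestMemory) -> (Vmm, usize) {
    let vmm = create_vmm(guest_memory);
    let size = total_len(&vmm.guest_memory);
    (vmm, size)
}
```

Since nothing after `create_vmm` needs to own the memory, a shared reference into the `Vmm` is enough and the clone becomes unnecessary.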
This PR is an assortment of 5 small changes that are only connected by the fact that I ran into 4 of them while playing around with guest_memfd in Firecracker (the odd one out is the UFFD one, which is related to #4601). Roughly, we can group these changes into 3 categories:
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.