Skip to content
This repository has been archived by the owner on Nov 6, 2024. It is now read-only.

Add serde support for bindings commonly used in migration/snapshotting #101

Merged
merged 7 commits into from
Apr 16, 2024

Conversation

roypat
Copy link
Collaborator

@roypat roypat commented Mar 6, 2024

On crates.io, kvm-bindings has ~1.1M less downloads than kvm-ioctls, even though it is a non-optional dependency of the latter. The reason for this is that most project using kvm-ioctls do not actually depend rust-vmm/kvm-bindings, but instead use crates-io.patch directives in Cargo.toml to overwrite kvm-ioctl's dependency on kvm-bindings with a local fork. These forks usually serve a common need: supporting the (de)serialization of various structures define in kvm-bindings to allow for VM migration/snapshotting. As a result from this, these forks are also all nearly identical, making inclusion of the functionality upstream an obvious idea. Maintaining these forks also adds quite a bit of overhead for downstream rust-vmm users when it comes to consuming kvm-bindings updates, as all serialization-related functionality has to be rebased on top of upstream (which was particularly painful for the recent upgrade to 6.5 bindings).

There has been some discussion about including (de)serialization support in upstream rust-vmm a few years ago (#25, #24), with the conclusion as I understand it being that people were open to having a serialization solution upstream, but this seemingly never manifested. In this RFC, I am hoping to pick this thread up again.

The main difference between this PR and the old discussions is that this PR is serde-based instead of versionize-based. This is because all hypervisor projects I am aware of have switched to using serde for their migration/snapshotting needs (Firecracker itself switched earlier this year). There were also concerns about not wanting to take dependencies on crates outside the rust-vmm organization (as versionize is maintained by the Firecracker team), which are also alleviated by using serde (which is the de-facto industry standard for serialization in Rust).

This PR implements/derives Serialize and Deserialize for the minimum set of items needed by Firecracker and CloudHypervisor (e.g. both Firecracker and CloudHypervisor compile if their respective forks are replaced with this branch, both on x86_64 and aarch64). These impls are hidden behind a with-serde feature flag (name chosen to match the corresponding flag in vmm-sys-util). Thanks to how cargo's feature resolver works, we will not need to "feed" this flag through to kvm-ioctls or other reverse-dependencies. If a downstream crate enables the with-serde feature, it will get enabled globally (meaning crates that enable the feature can interoperate with those that do not).

Open Questions

There are a handful of open question that still need some thought before we can move forward with this:

  • This PR adds serde support for a minimal set of items that was derived from Firecracker/CloudHypervisor needs. This might not match the needs of other users (and we might end up seeing a lot of PRs adding serde support for additional items). Should we try to add derives for more items from the get-go? We only add support for a minimal set needed by FC/CH. If needed, more implementations can be added in patch releases
  • This PR modifies bindgen-generated code, meaning the derives for (De)Serialize would have to be manually readded every time the bindings are regenerated. This makes it easy to accidentally remove implementations. One solution to this would be to have a separate file that statically asserts the serde traits are implements (e.g. fn has_serde<T: Serialize + Deserialize>() {} and then a unit test that just does has_serde::<kvm_regs>() etc for every item we want to have serde support for). Another solution would be to instruct bindgen to automatically derive Serialize/Deserialize for every struct/enum, and have a separate file that contains manual implementations for items the derive fails (e.g. unions). This would also address point 1, although would significantly balloon the set of changes. We have implemented tests that ensure we do not regress on serde implementations.
    • Related to this is the question of whether we want to keep manual implementation of (De)Serializer that are required for various type (such as unions) to live in-line in bindings.rs, or inside a separate module (so that during bindings regeneration, we would really only need to readd derives and nothing more) Manual implementations are in a separate module, together with the tests.

@roypat roypat force-pushed the serde branch 2 times, most recently from 1c1236c to 19d8ad7 Compare March 6, 2024 16:02
@andreeaflorescu
Copy link
Member

This is a great initiative! I think bindgen now supports adding custom derives to structures, so we might be able to add serialize/deserialize when generating the bindings: rust-lang/rust-bindgen#1089. I didn't try it myself to see how it works, but his might solve point 2. I would thus not be very concerned about the manual derives as long as we have tests in separate files (that are not autogenerated), in which we test that we don't regress on serialize/deserialize by mistake.

CC: @rbradford @sboeuf @jiangliu (as you were also involved in the initial discussions). Would this help with removing the need for a local fork for the projects you're working on?

@roypat
Copy link
Collaborator Author

roypat commented Mar 7, 2024

This is a great initiative! I think bindgen now supports adding custom derives to structures, so we might be able to add serialize/deserialize when generating the bindings: rust-lang/rust-bindgen#1089. I didn't try it myself to see how it works, but his might solve point 2. I would thus not be very concerned about the manual derives as long as we have tests in separate files (that are not autogenerated), in which we test that we don't regress on serialize/deserialize by mistake.

Oh, neat, didn't know bindgen had support for that! Although it does seem like it does not support putting those derives behind a feature flag (the cli option --with-derive-custom says "Derive custom traits on any kind of type. The CUSTOM value must be of the shape REGEX=DERIVE where DERIVE is a coma-separated list of derive macros") :(

I guess what would make the process the least painful is some comment in the test being like "if this test fails, make sure all the below structs/etc. have a #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize)] attribute", and all manual impls moved into some serialize.rs so that they dont get overwritten.

@roypat roypat force-pushed the serde branch 7 times, most recently from 901aae1 to fbbef95 Compare March 7, 2024 11:31
@roypat
Copy link
Collaborator Author

roypat commented Mar 7, 2024

Alright, I've updated the PR to separate manual (De)Serialize implementations into their own serialize.rs files, and also added some compile-time tests to ensure we don't "loose" implementations when we regenerate the bindings. I've also brought the (De)Serialize implementations inline with CloudHypervisor's current fork, based on conversations with @rbradford about their live update needs

@rbradford
Copy link
Contributor

@likebreath Ping! I think your feedback would be good on this!

@roypat
Copy link
Collaborator Author

roypat commented Mar 12, 2024

Summarizing some comments from cloud-hypervisor/cloud-hypervisor#6272 (comment):

CloudHypervisor supports live upgrade across minor versions, so a serialization solution that would be backwards compatible across kernel upgrades would be ideal. One option to achieve this would be to just serialize all kvm-bindings structures as "byte array + len". They're all repr(C), so it would be sound, and the backwards compatibility of Linux's ABI would guarantee us that migrating the same serialized data across bindings generated from different kernel versions would not break:

  • If inside some struct the kernel changes reserved bytes to actually have some meaning, this is completely transparent to a byte-dump of the structure
  • If the kernel adds some fields to the end of a structure, we can detect by comparing the serialized length with size_of::<T>(), and appending zeros
  • If we live-downgrade and to a kvm-bindings version where a field does not yet exist, we can truncate the extra bytes (although this probably would break the guest).

This would also have the advantage that we would not have to add derives to the bindgen generated code, as I envision the (De)Serialize implementations to be generated by a declarative macro defined in serialize.rs. This is not quite accurate, as the bindgen types will still need to implement a trait such as zerocopy::FromBytes.

I will probably have a go at implementing a prototype of this the next few days, but please let me know already if you have any comments/concerns!

/cc @likebreath

@rbradford
Copy link
Contributor

Ideally we would like to support live migration across major versions too - we don't need to be able to support downgrading across versions though. I really like your proposal of treating the bindings (for the purpose of migration) as an opaque byte array which could be bigger at the destination vs the source.

@roypat roypat force-pushed the serde branch 2 times, most recently from b7c9678 to efd34f6 Compare March 14, 2024 10:58
@roypat
Copy link
Collaborator Author

roypat commented Mar 14, 2024

Ideally we would like to support live migration across major versions too - we don't need to be able to support downgrading across versions though. I really like your proposal of treating the bindings (for the purpose of migration) as an opaque byte array which could be bigger at the destination vs the source.

I think with the byte array approach to serialization you'll achieve that goal, however migrating from the current (De)Serialize implementations to the byte-array approach will be a breaking change :(

That being said, I've updated this PR to use zerocopy for the serialization. The changes to bindings.rs are a bit uglier, so I guess we'll definitely need to write some documentation on how to regenerate bindings if we go forward with this.

@rbradford
Copy link
Contributor

Ideally we would like to support live migration across major versions too - we don't need to be able to support downgrading across versions though. I really like your proposal of treating the bindings (for the purpose of migration) as an opaque byte array which could be bigger at the destination vs the source.

I think with the byte array approach to serialization you'll achieve that goal, however migrating from the current (De)Serialize implementations to the byte-array approach will be a breaking change :(

That being said, I've updated this PR to use zerocopy for the serialization. The changes to bindings.rs are a bit uglier, so I guess we'll definitely need to write some documentation on how to regenerate bindings if we go forward with this.

That's fine - we can switch CH to use this on a new major version. We don't advertise support for major version migration because there are still issues like this that need to be shaken out. Thank you very much for tackling this - it will be a massive improvement.

@roypat roypat changed the title [RFC] Add serde support for bindings commonly used in migration/snapshotting Add serde support for bindings commonly used in migration/snapshotting Mar 19, 2024
@roypat
Copy link
Collaborator Author

roypat commented Mar 19, 2024

Hey all,
I've removed the "RFC" tag from the commit title, since I think we've aligned on a decent solution for serialization support (and all the original questions are answered). As such, I consider this PR ready for review :)

Copy link
Contributor

@likebreath likebreath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for driving this improvement. I guess this is the best we can do to have a seamless live upgrade across kernel versions, while I still worry about debugging issues caused by serialization as byte array. Hopefully, the kernel ABI backward compatibility ensures there is rarely such thing happening.

With that, I ran some live migration tests of Cloud Hypervisor with this PR, and it works well. It is also very easy to integrate thanks to the feature gate being used that is identical to what we have.

#[cfg_attr(
feature = "with-serde",
derive(zerocopy::AsBytes, zerocopy::FromBytes, zerocopy::FromZeroes)
)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that the kvm_xsave struct serialization is only good for the first 4K bytes, and VMM needs to maintain a vector of kvm_xsave for migration/upgrade (while relying on conversions to kvm_xsave2 for related KVM ioctl), right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's pretty much the idea. I was planning to add support for KVM_GET_XSAVE2 to kvm_ioctls using the Fam struct added in this PR (based on querying the KVM_CAP_XSAVE2 capability to determine the size of the FAM)

@roypat
Copy link
Collaborator Author

roypat commented Apr 2, 2024

friendly ping for reviewers :)

rbradford
rbradford previously approved these changes Apr 2, 2024
Copy link
Contributor

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@likebreath Did you have a chance to try this against Cloud Hypervisor?

@likebreath
Copy link
Contributor

@likebreath Did you have a chance to try this against Cloud Hypervisor?

Yes, I did. As mentioned in the comment above #101 (review), the switch to this fork was easy and our live migration tests were working fine.

Cargo.toml Outdated Show resolved Hide resolved
src/x86_64/bindings.rs Show resolved Hide resolved
src/x86_64/bindings.rs Show resolved Hide resolved
.buildkite/custom-tests.json Outdated Show resolved Hide resolved
src/x86_64/fam_wrappers.rs Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
Add implementations of `serde::{Serialize, Deserialize}` for
kvm-bindings structures commonly used for live migration/snapshotting.
The selections of structures to add the derives to was determined by
finding the minimal set that allows both Firecracker and CloudHypervisor
to compile against this branch.

Serialization happens as opaque binary blobs via the `zerocopy` crate.
This is to support live migration across different versions of
kvm-bindings (which can potentially be using bindings generated from
different versions of the kernel headers). Since the kernel's ABI is
backward-compatible, the only changes that can happen are:

- "reserved" areas of structures acquire meaning. This is a change
  purely at the representational level, and has no impact on the
  underlying binary representation. Thus, our serialization is
  roboust to these changes (whereas simply deriving the serde traits
  would cause incompatibilities here).

- Adding new fields to the end of a structure. When deserializating, we
  compare the length of the deserialized data with the size of the
  struct, and pad with zeros/truncate as appropriate.

This allows seamless live update.

We pick the `zerocopy` crate for binary serialization as it is already
in use in the rust-vmm ecosystem (in `acpi_tables`), and it has a very
strong security stance (for example, using formal verification in their
CI). The alternative would have been using `ByteValued` from
`vm-memory`, but I do not want `kvm-bindings` to gain a dependency on
`vm-memory` (especially since `vm-memory` brings in a lot more
functionality that `kvm-bindings` does not need).

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
roypat added 6 commits April 10, 2024 15:53
Do a build-test of the `with-serde` feature. Also do some simple
round-trip testing to ensure consistent serialization (particularly, for
the items that have manual (De)Serialize implementations).

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
We already have a `cargo check --all-targets --all-features` with `-D
warnings` in the default CI inherited from rust-vmm-ci, so remove this
duplicate.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
In linux 5.17, kvm_xsave got turned into a FamStruct by adding the
flexible "extra" member to its definition. However, unlike all other
such structs, it does not contain a "length" field. Instead, the length
of the flexible array member has to be determined by querying the
`KVM_CAP_XSAVE2` capability. This requires access to a VM file
descriptor, and thus cannot happen in the `FamStruct::len` trait method.
To work around this, define a wrapper struct that caches the length of a
previous `KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)` call, and implement
`FamStruct` for this wrapper. Then in kvm-ioctls, we can expose a
function that first query `KVM_CAP_XSAVE2`, then invokes
`KVM_GET_XSAVE2` to retrives the `kvm_xsave` structure, and then combine
them into the below `kvm_xsave2` structure to be managed as a
`FamStruct`.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Describe what needs to be done to not break serialization support.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
The contributing file contains instructions for how to (re)generate
bindings, but this is a somewhat unintuitive location for these
instructions. Thus explicitly refer to it from the README.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
@roypat roypat requested a review from rbradford April 11, 2024 13:04
@roypat
Copy link
Collaborator Author

roypat commented Apr 11, 2024

(no functional changes since the last revision you reviewed, Rob, only documentation updates)

@rbradford
Copy link
Contributor

@roypat Thanks so much for working on this - we're going to move CH over to use this soon!

Copy link

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this is solving a long haul issue 👍

@roypat roypat merged commit 11139a6 into rust-vmm:main Apr 16, 2024
2 checks passed
@roypat roypat deleted the serde branch April 26, 2024 07:57
@roypat roypat mentioned this pull request May 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants