Reference VMM MVP #11

alxiord · 2020-09-08T11:08:53Z

This PR introduces the first iteration of a rust-vmm-based reference VMM.

lauralt · 2020-09-08T13:45:56Z

README.md

+cargo run                       \
+    --guest-memory=1024         \
+    --vcpus=1                   \
+    --kernel=/path/to/vmlinux   \
+    --cmdline="cmdline"


This should be updated either here or in #1 to its current form.

src/vmm/src/vcpu/mod.rs

andreeaflorescu

We should not add binary blobs to GitHub:

* vmlinux-hello-busybox
* vmlinux-hello-busybox-halt

alxiord · 2020-09-09T15:39:54Z

> We should not add binary blobs to GitHub:
> * vmlinux-hello-busybox
> * vmlinux-hello-busybox-halt

Replaced with a post-checkout hook that builds the 2nd image, which is used in the tests.
The 1st one is for demo purposes, and there's a script in resources/kernel for it.

jdub · 2020-09-10T12:45:36Z

This is incredibly readable and straightforward! 🤘🏻

lauralt

First part of review.

I'm a bit intrigued by the Cargo.toml workspace changes and how they deal with cargo build/cargo run commands. Would really appreciate if someone has more info here.

.buildkite/hooks/post-checkout

README.md

lauralt · 2020-09-11T09:49:12Z

README.md


-...
+```bash
+vmm-reference                                                           \


With the new workspace changes in Cargo.toml, it looks like we have to run cargo build --workspace to generate the vmm-reference binary. This should be mentioned somewhere.
Also, I would add a short Getting started section here that lists the exact steps for successfully running the reference vmm.
Moreover, I would add the console to the cmdline so the user can see the guest output.

Added a doc as the getting started section might get fluffier once we start adding devices and extending platform support. Added the cmdline in the example too.
I reverted the workspace settings to the previous ones (empty [workspace]), but we'll have to fix rust-vmm-ci and update the hook before the tests pass again 😄

lauralt · 2020-09-11T10:21:54Z

tests/test_run_reference_vmm.py

+    # test until the child process ends, which it doesn't. So we have to trust
+    # that the output is there, let the VMM run for a bit, then kill it.
+    # In the future, it will communicate via metrics / devices.
+    vmm_process = subprocess.Popen(vmm_cmd)


Hmmm, are we sure that this test actually runs vmm_cmd successfully? subprocess.Popen seems to have a weird behavior (i.e. it fails, from what I noticed, only if the binary doesn't exist (in this case cargo, which exists)).
Wouldn't it be better to run subprocess.run for example or something else?

Also, funny question: is the cargo run ... option still valid with the workspace changes that I mentioned in another comment? I didn't manage to run it successfully now, but I might be doing something really wrong :-?.

Unlike Popen, subprocess.run doesn't return until the child process ends, so the test would get stuck in it.
As for the workspace stuff, I'm looking into those, it's not clear to me yet how we want to proceed.

From what I manually tested, the things mentioned here: rust-vmm/rust-vmm-ci#38 seem to fix the workspace issue.

I checked again, the test is still passing if, for example, we replace run with blah in vmm_cmd. In this case, the test is not useful. I'll try looking more into how we can make this test fail if the command is invalid.

I think testing cargo blah instead of cargo run tests cargo more than it does the VMM 🤔

It was just an example, replacing --memory with some invalid arg will keep the test passing :(. In my opinion we should make sure that whatever subprocess function we choose, the test is passing only if cargo run ... (or even cargo blah) is successful.
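
One way to make a test like this notice an early failure is to spawn the child, give it a grace period, and then check without blocking whether it has already exited. The sketch below shows the idea in Rust rather than in the Python of the test under discussion; `spawn_and_watch` and `Startup` are hypothetical names, not code from this PR. In Python, `Popen.poll()` plays the role that `Child::try_wait` plays here.

```rust
use std::io;
use std::process::Command;
use std::thread;
use std::time::Duration;

/// What happened to the child during the grace period (hypothetical type).
#[derive(Debug, PartialEq)]
enum Startup {
    /// The child was still alive after the grace period and was then killed.
    StillRunning,
    /// The child exited on its own; carries its exit code, if it has one.
    ExitedEarly(Option<i32>),
}

/// Spawns `cmd`, sleeps for `grace`, and reports whether the child exited early.
fn spawn_and_watch(cmd: &mut Command, grace: Duration) -> io::Result<Startup> {
    // `spawn` fails only if the program itself cannot be launched.
    let mut child = cmd.spawn()?;
    thread::sleep(grace);
    match child.try_wait()? {
        // The child is already gone: surface its exit code to the caller.
        Some(status) => Ok(Startup::ExitedEarly(status.code())),
        // Still running: stop it and reap it so no zombie is left behind.
        None => {
            child.kill()?;
            child.wait()?;
            Ok(Startup::StillRunning)
        }
    }
}
```

A test built on this would assert `Startup::StillRunning` for a valid command line, so an invalid argument that makes the child exit right away turns into a failure instead of a silent pass.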

src/vmm/src/vcpu/mod.rs

lauralt

Second part of review.

lauralt · 2020-09-11T12:55:39Z

README.md

+
+## CLI reference
+
+* `memory` - guest memory configurations


I think it would be nice to have a small introduction here.

lauralt · 2020-09-11T13:00:15Z

src/cli/src/lib.rs

+                    .validator(Self::validate_kernel_config),
+            )
+            .get_matches_from_safe(cmdline_args)
+            .map_err(|_| "Failed to parse command line arguments".to_string())?;


The errors returned by validator()s should be propagated here.

Also, can we avoid calling the try_from()s twice for every arg?

I haven't figured out a way to do either. To me it seemed like the easiest way to validate those strings was to try and cast them into *Config objects, and consider them valid if the cast succeeds. As this CLI is meant to be for demonstrative purposes only, and throw-away, I didn't think too much of anything fancier 😄 The cleanest way IMO would be to replace clap like you mentioned in another comment, in the meantime I think we can compromise on the CLI at least until we commit to something as production-worthy as the VMM - WDYT?

I agree we can make some compromises on the CLI for now, but I assume people who want to use this crate will generally start by trying our CLI to see how the vmm works (I also think it will take some time to replace clap :( ). I think it's a pretty bad user experience if we log only "Failed to parse command line arguments" for every type of command line error, and it seemed we could easily avoid that by doing just:

.map_err(|e| format!("Failed to parse command line arguments {:?}", e))?;

(even though Debug doesn't exactly do the best thing here).

Anyway, I played around a bit and a possible solution here could be:

* remove the .validator() parts, since we validate the arguments' values when we try to build VMMConfig too. Keeping the validator()s as well doesn't give us much more useful error information.
* call get_matches_from() instead of get_matches_from_safe(), which will nicely log an error message if, for example, we give an unexpected argument (looks like we avoid panicking too, and map_err is no longer needed).

This is just a suggestion (and it can be improved I guess).

You're absolutely right that the user experience is pretty crappy and needs more work, and thanks for the reminder of why we're doing this in the first place 😄
I'm working on nicer errors.

> call get_matches_from() instead of get_matches_from_safe()

The _safe one returns a testable Result. The other one exits the process (not even a panic...), so there's no way to unit test the CLI interaction. How about a compromise: keep the _safe() one and the tests, but print the usage in case of errors?
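
The compromise proposed above (keep a testable `Result`, but tell the user what went wrong and print the usage) can be sketched without clap. This is a minimal illustration using only the standard library; `CliError`, `parse_flags`, `error_report` and the `USAGE` text are hypothetical and are not the PR's code.

```rust
use std::fmt;

/// Specific parse failures, so the caller can say what went wrong.
#[derive(Debug, PartialEq)]
enum CliError {
    UnknownArgument(String),
    MissingValue(String),
}

impl fmt::Display for CliError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            CliError::UnknownArgument(arg) => write!(f, "unknown argument `{}`", arg),
            CliError::MissingValue(flag) => write!(f, "missing value for `{}`", flag),
        }
    }
}

/// Hypothetical usage text; the real CLI may word this differently.
const USAGE: &str = "usage: vmm-reference --memory <cfg> --vcpus <cfg> --kernel <cfg>";

/// Pairs each known flag with its value and returns an error instead of exiting,
/// which keeps the function unit-testable.
fn parse_flags<'a>(args: &[&'a str]) -> Result<Vec<(&'a str, &'a str)>, CliError> {
    let mut parsed = Vec::new();
    let mut iter = args.iter();
    while let Some(&flag) = iter.next() {
        match flag {
            "--memory" | "--vcpus" | "--kernel" => {
                let value = iter
                    .next()
                    .ok_or_else(|| CliError::MissingValue(flag.to_string()))?;
                parsed.push((flag, *value));
            }
            other => return Err(CliError::UnknownArgument(other.to_string())),
        }
    }
    Ok(parsed)
}

/// Builds the message the binary would print before exiting with an error.
fn error_report(err: &CliError) -> String {
    format!("Failed to parse command line arguments: {}\n{}", err, USAGE)
}
```

Only the binary's entry point would print `error_report(&err)` and exit; the tests keep calling `parse_flags` and matching on the returned error.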

lauralt · 2020-09-11T13:01:22Z

src/cli/Cargo.toml

+edition = "2018"
+
+[dependencies]
+clap = "2.33.3"


Since this is a pretty huge dependency, a future improvement here would be to have our own arg parser.

src/cli/src/lib.rs

lauralt · 2020-09-11T13:53:59Z

src/main.rs

+            env::args()
+                .collect::<Vec<String>>()
+                .iter()
+                .map(|s| s.as_str())


Can we avoid this map() and have a &[String] for cmdline_args argument type?

I started off with Strings, then reverted to slices while writing the tests - everything had to be .to_string()'ed and .clone()'ed. It made the code overall more verbose and clone-y.

Yep, I had the same problem when replacing clap. I kind of prefer to make the source code simpler at the expense of tests that are a bit less readable, but definitely no strong preference here.
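
Neither side proposed it in this thread, but one way to sidestep the trade-off is to make the argument type generic over `AsRef<str>`, so `main` can pass the collected `Vec<String>` while tests pass string literals. A minimal sketch, where `count_flags` is a hypothetical stand-in for the real CLI entry point:

```rust
/// Counts the arguments that look like flags. The body is a placeholder:
/// the point is the signature, which accepts `&[String]` and `&[&str]` alike.
fn count_flags<S: AsRef<str>>(cmdline_args: &[S]) -> usize {
    cmdline_args
        .iter()
        .map(|arg| arg.as_ref()) // borrow each element as `&str`
        .filter(|arg| arg.starts_with("--"))
        .count()
}
```

With this signature, `main` could pass `&env::args().collect::<Vec<String>>()` directly, and the tests would not need `.to_string()` on every literal.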

docs/DESIGN.md

lauralt

A few other comments. Still have to review the vmm crate.

docs/DESIGN.md

lauralt · 2020-09-23T07:34:23Z

docs/DESIGN.md

+1. Configure legacy devices. This is done partially through `kvm-ioctls`,
+   partially (serial console emulation) through `vm-superio`. Device event
+   handling is mediated with `event-manager`.
+    1. Requirements: KVM is configured, guest memory is configured, `irqchip`


nit: full stop.

lauralt · 2020-09-23T08:27:22Z

docs/DESIGN.md

@@ -0,0 +1,243 @@
+# `rust-vmm` reference VMM Design


Really nice documentation 💯

docs/getting-started.md

lauralt · 2020-09-25T07:08:41Z

src/vmm/tests/integration_tests.rs

+    // Sanity check. Because the VMM runs in a separate process, if the file doesn't exist,
+    // all we see is a different exit code than 0.
+    assert!(kernel_path.as_path().exists());


While reviewing rust-vmm/rust-vmm-ci#37, I noticed that even if this path doesn't exist, so the assert fails/should fail, the test is still passing. I assume this is not the desired behavior. Can you expand a bit on how this exit code is propagated further?

Great catch! The default panic handler exited with 0, so the parent process thought the child was on the happy path. I moved the assert before the fork to avoid the unnecessary new process on the error path, and installed a panic hook that exits a panicky process with a nonzero exit code.
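
The hook itself is not shown in this thread. As a minimal sketch of the idea, the standard library's `panic::take_hook` and `panic::set_hook` can wrap the existing hook and then run an exit action. `install_panic_hook` is a hypothetical name, and the exit action is passed in as a closure only so the sketch can be exercised without terminating the calling process.

```rust
use std::panic;

/// Installs a panic hook that first runs the previously installed hook
/// (so the panic message is still printed) and then calls `on_panic(code)`.
fn install_panic_hook<F>(code: i32, on_panic: F)
where
    F: Fn(i32) + Send + Sync + 'static,
{
    // Keep the old hook so the usual panic message is not lost.
    let previous_hook = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        previous_hook(info);
        // In a forked child this would be `std::process::exit(code)`.
        on_panic(code);
    }));
}
```

In the forked child, the call would look like `install_panic_hook(1, |code| std::process::exit(code));`, placed before any code that may panic, so the parent sees a nonzero exit status.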

The simple command line parser creates `*Config` objects for the VMM from plaintext parameters, in `key=value` format, delimited by commas. Example:

    vmm-reference \
        --memory mem_size_mib=1024 \
        --vcpus num_vcpus=1 \
        --kernel path=/path/to/vmlinux,zeropg=1234,cmdline="pci=off"

Fixes rust-vmm#6
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
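
The `key=value` format described in this commit message can be sketched as follows. This is an illustration only, not the parser from the PR: `parse_key_values` is a hypothetical helper that splits on commas, then on the first `=` of each pair (so a value such as `pci=off` survives), and strips surrounding double quotes. It assumes values never contain commas.

```rust
use std::collections::HashMap;

/// Parses `key=value` pairs delimited by commas into a map.
/// Assumption: values contain no commas, so a plain split is enough.
fn parse_key_values(arg: &str) -> Result<HashMap<String, String>, String> {
    let mut map = HashMap::new();
    for pair in arg.split(',') {
        // Split on the first `=` only; the value may itself contain `=`.
        let mut parts = pair.splitn(2, '=');
        let key = parts.next().unwrap_or("").trim();
        let value = parts
            .next()
            .ok_or_else(|| format!("missing '=' in parameter `{}`", pair))?;
        if key.is_empty() {
            return Err(format!("empty key in parameter `{}`", pair));
        }
        // Drop optional surrounding quotes, as in cmdline="pci=off".
        map.insert(key.to_string(), value.trim_matches('"').to_string());
    }
    Ok(map)
}
```

Converting the resulting map into a typed `*Config` object, including rejecting unknown keys and parsing numbers, would be a separate step layered on top.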

...that doesn't really test anything. The test is a scaffold for future iterations of the reference VMM, which will be able to programmatically communicate with the outside world (via devices / metrics). Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

The build-deps pipeline runs _before_ all other vmm-reference pipelines in Buildkite, and (as suggested by the name) builds dependencies for the ensuing tests (for now, a kernel image). The /tmp directory on the Buildkite machine is used as a local cache. Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

lauralt

Looks good, just a few nits/leftovers, but they can be fixed in a subsequent PR.

lauralt · 2020-10-28T08:24:48Z

resources/kernel/make_busybox.sh

+    rm -f sda && mknod sda b 8 0
+    rm -f console && mknod console c 5 1


lauralt · 2020-10-28T08:57:16Z

src/vmm/src/lib.rs

+#[derive(Debug)]
+pub enum MemoryError {
+    /// Failure during guest memory operation.
+    GuestMemory(GuestMemoryError),


nit: looks like this could still be removed

lauralt · 2020-10-28T09:30:36Z

resources/kernel/make_busybox.sh

+# Supported arguments:
+# * `-h`: build a guest that halts.
+# * `-j`: compile with `make -j` (on all available CPUs).
+#         TODO: pass the value for `-j`.


Also, passing the VMLINUX name as an arg could be a good TODO. It looks like if we modify something in the script (and we don't want to recompile the kernel), we also have to change that name, otherwise the old image will be copied (or maybe we could overwrite the old image :-?).
Can we open an issue for future improvements to this script so we don't forget about them?

lauralt · 2020-10-28T09:45:30Z

resources/kernel/make_busybox.sh

+mount -t sysfs none /sys
+/bin/echo "                                                   "
+/bin/echo "                 _                                 "
+/bin/echo "  _ __ _   _ ___| |_    __   ___ __ ___  _ __ ___  "
+/bin/echo " | '__| | | / __| __|___\ \ / / '_ \ _ \| '_ \ _ \ "
+/bin/echo " | |  | |_| \__ \ ||_____\ V /| | | | | | | | | | |"
+/bin/echo " |_|   \__,_|___/\__|     \_/ |_| |_| |_|_| |_| |_|"
+/bin/echo "                                                   "
+/bin/echo "                                                   "
+/bin/echo "Hello, world, from the rust-vmm reference VMM!"
+EOF


Suggested change

 mount -t sysfs none /sys
+mknod -m 666 /dev/ttyS0 c 4 64
 /bin/echo "                                                   "
 /bin/echo "                 _                                 "
 /bin/echo "  _ __ _   _ ___| |_    __   ___ __ ___  _ __ ___  "
 /bin/echo " | '__| | | / __| __|___\ \ / / '_ \ _ \| '_ \ _ \ "
 /bin/echo " | |  | |_| \__ \ ||_____\ V /| | | | | | | | | | |"
 /bin/echo " |_|   \__,_|___/\__|     \_/ |_| |_| |_|_| |_| |_|"
 /bin/echo "                                                   "
 /bin/echo "                                                   "
 /bin/echo "Hello, world, from the rust-vmm reference VMM!"
+setsid cttyhack sh
 EOF

With this workaround, "/bin/sh: can't access tty; job control turned off" is no longer printed after the Hello, world, ... line: https://stackoverflow.com/questions/36529881/qemu-bin-sh-cant-access-tty-job-control-turned-off. The other answer there might be nicer, but I haven't managed to make it work yet.

alxiord force-pushed the m0 branch from 0ab28d8 to 8da87a4 Compare September 8, 2020 12:52

lauralt reviewed Sep 8, 2020

alxiord force-pushed the m0 branch 4 times, most recently from bd8326b to b862ff6 Compare September 8, 2020 18:43

andreeaflorescu previously requested changes Sep 9, 2020

alxiord force-pushed the m0 branch 6 times, most recently from 41907fe to 81c937b Compare September 9, 2020 15:24

alxiord force-pushed the m0 branch from 7959b5b to da10fc0 Compare September 10, 2020 12:59

lauralt reviewed Sep 11, 2020

lauralt mentioned this pull request Sep 11, 2020

Add support for workpace tests rust-vmm/rust-vmm-ci#38

Closed

alxiord force-pushed the m0 branch 3 times, most recently from cda7158 to c8f0b52 Compare September 22, 2020 08:55

Skeleton implementation

5c618a7

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

alxiord force-pushed the m0 branch from c8f0b52 to e2989dc Compare September 22, 2020 12:27

lauralt reviewed Sep 23, 2020

alxiord force-pushed the m0 branch 3 times, most recently from 1139668 to 0fd9701 Compare September 24, 2020 13:27

lauralt reviewed Sep 25, 2020

alxiord force-pushed the m0 branch from 0fd9701 to 26e43e8 Compare September 25, 2020 13:03

alxiord force-pushed the m0 branch 2 times, most recently from 70ff97f to 21b69b1 Compare October 27, 2020 15:24

Alexandra Iordache added 18 commits October 28, 2020 10:48

Configure the serial console

1c3f37b

Fixes rust-vmm#5 Fixes rust-vmm#12 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

vcpu configuration: mptables

f76485a

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

vcpu: configure cpuid

1cccda7

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

vcpu: configure msrs

9784546

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

vcpu: configure regs, sregs, fpu

1dbc603

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

vcpu: configure LAPICs

f8a24d3

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

vcpu: implement emulation loop

c4d472c

Fixes rust-vmm#3 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

resources: add kernel & steps to build

88989b1

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

rust-vmm-ci submodule update

860cbdd

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

cleanup warnings

1750f6a

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

minimal integration test that boots the demo

c102ef5

Fixes rust-vmm#7 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

fix clippy lints

b777316

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

Add more unit tests and update coverage

0879ab9

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

aarch64: Fix build

292cc94

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

toml update: workspace, authors & vm-superio

bed6295

Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

readme update & design document

420ba15

Fixes rust-vmm#2 Fixes rust-vmm#8 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

alxiord force-pushed the m0 branch 2 times, most recently from 48088c1 to faaf9e3 Compare October 28, 2020 09:12

Alexandra Iordache and others added 2 commits October 28, 2020 11:15

tests: communicate with vmm serial

da09e16

Wait for the VMM to boot, and then write `reboot -f` and check that it was executed successfully. Signed-off-by: Andreea Florescu <fandree@amazon.com> Signed-off-by: Alexandra Iordache <aghecen@amazon.com>

alxiord force-pushed the m0 branch 2 times, most recently from 71a693e to da09e16 Compare October 28, 2020 09:27

andreeaflorescu approved these changes Oct 28, 2020

lauralt approved these changes Oct 28, 2020

lauralt merged commit a715248 into rust-vmm:master Oct 28, 2020

alxiord deleted the m0 branch November 3, 2020 17:22
