-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reference VMM MVP #11
Conversation
README.md
Outdated
cargo run \ | ||
--guest-memory=1024 \ | ||
--vcpus=1 \ | ||
--kernel=/path/to/vmlinux \ | ||
--cmdline="cmdline" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be updated either here or in #1 to its current form.
bd8326b
to
b862ff6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not add binary blobs to GitHub:
- vmlinux-hello-busybox
- vmlinux-hello-busybox-halt
41907fe
to
81c937b
Compare
Replaced with a post-checkout hook that builds the 2nd image, which is used in the tests. |
This is incredibly readable and straightforward! 🤘🏻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First part of review.
I'm a bit intrigued by the Cargo.toml
workspace changes and how they deal with cargo build
/cargo run
commands. Would really appreciate if someone has more info here.
|
||
... | ||
```bash | ||
vmm-reference \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the new workspace changes in Cargo.toml
it looks like we have to run cargo build --workspace
to generate the vmm-reference
binary. This should be mentioned somewhere.
Also, I would add here a little Getting started or something like this, in which to mention the exact steps for successfully running a reference vmm.
Moreover, I would add the console in cmdline
for user to be able to see the guest output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a doc as the getting started
section might get fluffier once we start adding devices and extending platform support. Added the cmdline
in the example too.
I reverted the workspace settings to the previous ones (empty [workspace]
), but we'll have to fix rust-vmm-ci and update the hook before the tests pass again 😄
tests/test_run_reference_vmm.py
Outdated
# test until the child process ends, which it doesn't. So we have to trust | ||
# that the output is there, let the VMM run for a bit, then kill it. | ||
# In the future, it will communicate via metrics / devices. | ||
vmm_process = subprocess.Popen(vmm_cmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, are we sure that this test actually runs vmm_cmd
successfully? subprocess.Popen
seems to have a weird behavior (i.e. it fails, from what I noticed, only if the binary doesn't exist (in this case cargo
, which exists)).
Wouldn't it be better to run subprocess.run
for example or something else?
Also, funny question: is the cargo run ...
option still valid with the workspace changes that I mentioned in another comment? I didn't manage to run it successfully now, but I might be doing something really wrong :-?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike Popen
, subprocess.run
doesn't end until the child process ends - so the test would be stuck in it.
As for the workspace stuff, I'm looking into those, it's not clear to me yet how we want to proceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I manually tested, the things mentioned here: rust-vmm/rust-vmm-ci#38 seem to fix the workspace issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked again, the test is still passing if, for example, we replace run
with blah
in vmm_cmd
. In this case, the test is not useful. I'll try looking more into how we can make this test fail if the command is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think testing cargo blah
instead of cargo run
tests cargo
more than it does the VMM 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was just an example, replacing --memory
with some invalid arg will keep the test passing :(. In my opinion we should make sure that whatever subprocess
function we choose, the test is passing only if cargo run ...
(or even cargo blah
) is successful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second part of review.
|
||
## CLI reference | ||
|
||
* `memory` - guest memory configurations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have a small introduction here.
src/cli/src/lib.rs
Outdated
.validator(Self::validate_kernel_config), | ||
) | ||
.get_matches_from_safe(cmdline_args) | ||
.map_err(|_| "Failed to parse command line arguments".to_string())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The errors returned by validator()
s should be propagated here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can we avoid calling the try_from()
s twice for every arg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't figured out a way to do either. To me it seemed like the easiest way to validate those strings was to try and cast them into *Config
objects, and consider them valid if the cast succeeds. As this CLI is meant to be for demonstrative purposes only, and throw-away, I didn't think too much of anything fancier 😄 The cleanest way IMO would be to replace clap
like you mentioned in another comment, in the meantime I think we can compromise on the CLI at least until we commit to something as production-worthy as the VMM - WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we can do for now some compromises on the CLI, but I assume people who will want to use this crate, in general will try at first to see how does the vmm
work by using our CLI (I also think it will take some time to replace clap
:( ). I think it's a pretty bad user experience if we log only "Failed to parse command line arguments" for each type of command line error and it seemed we could easily avoid it by doing just:
.map_err(|e| format!("Failed to parse command line arguments {:?}", e))?;
(even though Debug doesn't exactly do the best thing here).
Anyway, I played around a bit and a possible solution here could be:
- remove the
.validator()
parts since we validate the arguments' values when we try to buildVMMConfig
too. Keeping also thevalidator()
s doesn't give us much more useful error information. - call
get_matches_from()
instead ofget_matches_from_safe()
which will nicely log an error message if, for example, we give an unexpected argument (looks like we avoid panicking too andmap_err
is no longer needed).
This is just a suggestion (and it can be improved I guess).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right that the user experience is pretty crappy and needs more work, and thanks for the reminder of why we're doing this in the first place 😄
I'm working on nicer errors.
call
get_matches_from()
instead ofget_matches_from_safe()
The _safe
one returns a testable Result
. The other one exit
s the process (not even a panic
...), so there's no way to unit test the CLI interaction. How about a compromise: keep the _safe()
one and the tests, but print the usage in case of errors?
edition = "2018" | ||
|
||
[dependencies] | ||
clap = "2.33.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a pretty huge dependency, a future improvement here would be to have our own arg parser.
env::args() | ||
.collect::<Vec<String>>() | ||
.iter() | ||
.map(|s| s.as_str()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid this map()
and have a &[String]
for cmdline_args
argument type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started off with String
s, then reverted to slices while writing the tests - everything had to be .to_string()
'ed and .clone()
'ed. It made the code overall more verbose and clone
-y.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I had the same problem when replacing clap
. I kind of prefer to make the source code simpler at the expense of tests that are a bit less readable, but definitely no strong preference here.
cda7158
to
c8f0b52
Compare
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few other comments. Still have to review the vmm
crate.
1. Configure legacy devices. This is done partially through `kvm-ioctls`, | ||
partially (serial console emulation) through `vm-superio`. Device event | ||
handling is mediated with `event-manager`. | ||
1. Requirements: KVM is configured, guest memory is configured, `irqchip` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: full stop.
@@ -0,0 +1,243 @@ | |||
# `rust-vmm` reference VMM Design |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice documentation 💯
src/cli/src/lib.rs
Outdated
.validator(Self::validate_kernel_config), | ||
) | ||
.get_matches_from_safe(cmdline_args) | ||
.map_err(|_| "Failed to parse command line arguments".to_string())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we can do for now some compromises on the CLI, but I assume people who will want to use this crate, in general will try at first to see how does the vmm
work by using our CLI (I also think it will take some time to replace clap
:( ). I think it's a pretty bad user experience if we log only "Failed to parse command line arguments" for each type of command line error and it seemed we could easily avoid it by doing just:
.map_err(|e| format!("Failed to parse command line arguments {:?}", e))?;
(even though Debug doesn't exactly do the best thing here).
Anyway, I played around a bit and a possible solution here could be:
- remove the
.validator()
parts since we validate the arguments' values when we try to buildVMMConfig
too. Keeping also thevalidator()
s doesn't give us much more useful error information. - call
get_matches_from()
instead ofget_matches_from_safe()
which will nicely log an error message if, for example, we give an unexpected argument (looks like we avoid panicking too andmap_err
is no longer needed).
This is just a suggestion (and it can be improved I guess).
env::args() | ||
.collect::<Vec<String>>() | ||
.iter() | ||
.map(|s| s.as_str()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I had the same problem when replacing clap
. I kind of prefer to make the source code simpler at the expense of tests that are a bit less readable, but definitely no strong preference here.
1139668
to
0fd9701
Compare
// Sanity check. Because the VMM runs in a separate process, if the file doesn't exist, | ||
// all we see is a different exit code than 0. | ||
assert!(kernel_path.as_path().exists()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While reviewing rust-vmm/rust-vmm-ci#37, I noticed that even if this path doesn't exist, so the assert fails/should fail, the test is still passing. I assume this is not the desired behavior. Can you expand a bit on how is this exit code propagated further?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! The default panic handler exited with 0, so the parent process thought the child was on the happy path. I moved the assert
before the fork
to avoid the unnecessary new process on the error path, and installed a panic hook that exits a panicky process with a nonzero exit code.
70ff97f
to
21b69b1
Compare
Fixes rust-vmm#5 Fixes rust-vmm#12 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Fixes rust-vmm#3 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
The simple command line parser creates `*Config` objects for the VMM from plaintext parameters, in `key=value` format, delimited by commas. Example: vmm-reference \ --memory mem_size_mib=1024 \ --vcpus num_vcpus=1 \ --kernel path=/path/to/vmlinux,zeropg=1234,cmdline="pci=off" Fixes rust-vmm#6 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Fixes rust-vmm#7 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
...that doesn't really test anything. The test is a scaffold for future iterations of the reference VMM, which will be able to programmatically communicate with the outside world (via devices / metrics). Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Fixes rust-vmm#2 Fixes rust-vmm#8 Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
48088c1
to
faaf9e3
Compare
The build-deps pipeline runs _before_ all other vmm-reference pipelines in Buildkite, and (as suggested by the name) builds dependencies for the ensuing tests (for now, a kernel image). The /tmp directory on the Buildkite machine is used as a local cache. Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
Wait for the VMM to boot, and then write `reboot -f` and check that it was executed successfully. Signed-off-by: Andreea Florescu <fandree@amazon.com> Signed-off-by: Alexandra Iordache <aghecen@amazon.com>
71a693e
to
da09e16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few nits/leftovers, but they can be fixed in a subsequent PR.
rm -f sda && mknod sda b 8 0 | ||
rm -f console && mknod console c 5 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
#[derive(Debug)] | ||
pub enum MemoryError { | ||
/// Failure during guest memory operation. | ||
GuestMemory(GuestMemoryError), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: looks like this could still be removed
# Supported arguments: | ||
# * `-h`: build a guest that halts. | ||
# * `-j`: compile with `make -j` (on all available CPUs). | ||
# TODO: pass the value for `-j`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also passing the VMLINUX name as arg could be a good TODO. Looks like if we modify something in the script (and we don't want to recompile the kernel), we have to also change that name, otherwise the old image will be copied (or maybe we could overwrite de old image :-?).
Can we open an issue for future improvements to this script so we don't forget about them?
mount -t sysfs none /sys | ||
/bin/echo " " | ||
/bin/echo " _ " | ||
/bin/echo " _ __ _ _ ___| |_ __ ___ __ ___ _ __ ___ " | ||
/bin/echo " | '__| | | / __| __|___\ \ / / '_ \ _ \| '_ \ _ \ " | ||
/bin/echo " | | | |_| \__ \ ||_____\ V /| | | | | | | | | | |" | ||
/bin/echo " |_| \__,_|___/\__| \_/ |_| |_| |_|_| |_| |_|" | ||
/bin/echo " " | ||
/bin/echo " " | ||
/bin/echo "Hello, world, from the rust-vmm reference VMM!" | ||
EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mount -t sysfs none /sys | |
/bin/echo " " | |
/bin/echo " _ " | |
/bin/echo " _ __ _ _ ___| |_ __ ___ __ ___ _ __ ___ " | |
/bin/echo " | '__| | | / __| __|___\ \ / / '_ \ _ \| '_ \ _ \ " | |
/bin/echo " | | | |_| \__ \ ||_____\ V /| | | | | | | | | | |" | |
/bin/echo " |_| \__,_|___/\__| \_/ |_| |_| |_|_| |_| |_|" | |
/bin/echo " " | |
/bin/echo " " | |
/bin/echo "Hello, world, from the rust-vmm reference VMM!" | |
EOF | |
mount -t sysfs none /sys | |
mknod -m 666 /dev/ttyS0 c 4 64 | |
/bin/echo " " | |
/bin/echo " _ " | |
/bin/echo " _ __ _ _ ___| |_ __ ___ __ ___ _ __ ___ " | |
/bin/echo " | '__| | | / __| __|___\ \ / / '_ \ _ \| '_ \ _ \ " | |
/bin/echo " | | | |_| \__ \ ||_____\ V /| | | | | | | | | | |" | |
/bin/echo " |_| \__,_|___/\__| \_/ |_| |_| |_|_| |_| |_|" | |
/bin/echo " " | |
/bin/echo " " | |
/bin/echo "Hello, world, from the rust-vmm reference VMM!" | |
setsid cttyhack sh | |
EOF |
Using this get around, /bin/sh: can't access tty; job control turned off
won't be printed anymore after the Hello, world, ...
: https://stackoverflow.com/questions/36529881/qemu-bin-sh-cant-access-tty-job-control-turned-off. The other answer from here might be nicer, but I didn't manage to make it work yet.
This PR introduces the first iteration of a rust-vmm-based reference VMM.