
Remove hypervisor_handler thread #533


Open
wants to merge 6 commits into base: main

Conversation

Contributor

@ludfjig ludfjig commented May 27, 2025

  • Removes the per-VM hypervisor_handler thread in favor of running on the caller's thread.
  • Removes all timeout-based cancellation in favor of an explicit kill() API. Currently, only guest function calls can be cancelled, not the initial VM initialization, but this should be fine as long as you trust your guest binary. A guest can only be interrupted while it's in a blocking call to vcpufd.run(). Host function calls still cannot be interrupted (Make it possible to kill guest execution when running a host function. #192)

These changes should improve performance and throughput. They should also avoid the severe performance drop-off we observed under load when the hypervisor handler thread had to be joined after cancelling guest execution.

Added API changes:

impl MultiUseSandbox {
    /// Get a handle to the interrupt handler for this sandbox,
    /// capable of interrupting guest execution.
    pub fn interrupt_handle(&self) -> Arc<dyn InterruptHandle> {
         ...  
    }
    ...
}
/// A trait for handling interrupts to a sandbox's vcpu
pub trait InterruptHandle: Send + Sync {
    /// Interrupt the corresponding sandbox's vcpu if it's running.
    ///
    /// - If this is called while the vcpu is running, then it will interrupt the vcpu and return `true`.
    /// - If this is called while the vcpu is not running, then it will do nothing and return `false`.
    ///
    /// # Note
    /// This function will block for the duration of the time it takes for the vcpu thread to be interrupted.
    fn kill(&self) -> bool;

    /// Returns true iff the corresponding sandbox has been dropped
    fn dropped(&self) -> bool;
}
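
As a rough illustration of the contract documented above, here is a self-contained mock (MockHandle is an illustration-only name, not part of hyperlight) that models only the documented return values of kill() and dropped(), not the real signal-based implementation:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};

/// The trait from the PR description.
pub trait InterruptHandle: Send + Sync {
    fn kill(&self) -> bool;
    fn dropped(&self) -> bool;
}

/// Illustration-only mock of an interrupt handle.
struct MockHandle {
    running: AtomicBool,
    dropped: AtomicBool,
}

impl InterruptHandle for MockHandle {
    fn kill(&self) -> bool {
        // The real implementation signals the vcpu thread and blocks until
        // it is interrupted; here we only model "was the vcpu running?".
        self.running.swap(false, Ordering::AcqRel)
    }
    fn dropped(&self) -> bool {
        self.dropped.load(Ordering::Acquire)
    }
}

fn main() {
    let handle: Arc<dyn InterruptHandle> = Arc::new(MockHandle {
        running: AtomicBool::new(true),
        dropped: AtomicBool::new(false),
    });
    assert!(handle.kill());   // vcpu was "running": interrupt succeeds, returns true
    assert!(!handle.kill());  // no longer running: no-op, returns false
    assert!(!handle.dropped());
    println!("contract holds");
}
```

In the real API the handle would come from `multiuse_sandbox.interrupt_handle()` and `kill()` would typically be called from a different thread than the one driving the guest call.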

Removed API changes:

  • All timeout-based configuration

closes #471

Note: On KVM, moving the vcpufd (sandbox) to a new thread incurs a performance overhead the first time the vcpu is run on the new thread, as per the KVM kernel docs.

@ludfjig ludfjig force-pushed the cancel_vm3 branch 6 times, most recently from eb85e5f to 4abdd49 Compare May 27, 2025 22:56
@ludfjig ludfjig added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label May 27, 2025
@ludfjig ludfjig force-pushed the cancel_vm3 branch 4 times, most recently from c4d0a52 to 210e506 Compare May 28, 2025 00:14
@ludfjig ludfjig marked this pull request as ready for review May 28, 2025 00:52
Contributor

@simongdavies simongdavies left a comment

Left some comments/questions; not quite finished reviewing, I will finish it in the morning.

@@ -72,7 +71,8 @@ impl MultiUseGuestCallContext {
// !Send (and !Sync), we also don't need to worry about
// synchronization

call_function_on_guest(&mut self.sbox, func_name, func_ret_type, args)
self.sbox
Contributor

Why did this change to the _no_reset version?

Contributor Author

The previous call_function_on_guest in the removed file guest_dispatch.rs didn't reset memory either; I just renamed it to make that clearer. I moved the method to MultiUseSandbox because I'm not sure what purpose guest_dispatch.rs served now that SingleUseSandbox is removed.

Justfile Outdated
cargo test {{ if features =="" {''} else if features=="no-default-features" {"--no-default-features" } else {"--no-default-features -F " + features } }} --profile={{ if target == "debug" { "dev" } else { target } }} -p hyperlight-host --lib -- metrics::tests::test_metrics_are_emitted --exact --ignored
cargo test {{ if features =="" {''} else if features=="no-default-features" {"--no-default-features" } else {"--no-default-features -F function_call_metrics," + features } }} --profile={{ if target == "debug" { "dev" } else { target } }} -p hyperlight-host --lib -- metrics::tests::test_metrics_are_emitted --exact --ignored


Contributor

Why did these tests get deleted? They don't seem related to this change. I guess we potentially have different metrics now (although arguably we might want a kill metric and metrics about the success of killing, etc.), but even if we don't, we still have some metrics?

Contributor Author

@ludfjig ludfjig May 28, 2025

We don't have different metrics now. The test itself changed to use a local metrics recorder, which can only capture metrics that were emitted while the recorder is active, and only those emitted on the same thread. We couldn't use the local metrics recorder before, because the metrics were emitted from a separate thread (the hypervisor_handler thread).

In short, we no longer require #[ignore] on the test, so it runs as part of the other tests. Your comment made me realize we should still test it with the metrics feature flag, so I added that one back.

Contributor

It seems that deleting this file has deleted a bunch of tests that we would want to keep (e.g. the seccomp tests). I couldn't see that these had been relocated anywhere?

Contributor Author

@ludfjig ludfjig May 29, 2025

Whoops, you are correct. I added them back to initializedmultiuse.rs (I didn't keep the ones that made no sense).

@@ -89,17 +89,29 @@ fn main() -> Result<()> {
let no_op = Noop::<UninitializedSandbox, MultiUseSandbox>::default();

let mut multiuse_sandbox = usandbox.evolve(no_op)?;
let interrupt_handle = multiuse_sandbox.interrupt_handle();

const NUM_CALLS: i32 = 5;
Contributor

Question: are we logging attempts and results of kills? I assume so given this code has changed; if we are not logging when we kill, we should.

Contributor Author

@ludfjig ludfjig May 28, 2025

@@ -102,10 +102,22 @@ fn do_hyperlight_stuff() {
let no_op = Noop::<UninitializedSandbox, MultiUseSandbox>::default();

let mut multiuse_sandbox = usandbox.evolve(no_op).expect("Failed to evolve sandbox");
let interrupt_handle = multiuse_sandbox.interrupt_handle();
Contributor

Same as logging: are we creating metrics on how many times kill has been called/succeeded etc.? If not, we probably should.

Contributor Author

We have the same metrics as before, which count how many sandbox cancellations have occurred (successfully).


// Call a function that gets cancelled by the host function 5 times to generate some log entries.
const NUM_CALLS: i32 = 5;
Contributor

Same question as the logging example: if we log, then we should be creating trace records as well.

Contributor Author

Yep the metrics are still there, and I added logging for kill attempts

@@ -390,6 +393,11 @@ impl HypervLinuxDriver {
mem_regions,
entrypoint: entrypoint_ptr.absolute()?,
orig_rsp: rsp_ptr,
interrupt_handle: Arc::new(LinuxInterruptHandle {
running: AtomicBool::new(false),
tid: AtomicU64::new(unsafe { libc::pthread_self() }),
Contributor

What is the thread id here? What happens if the sandbox is created on thread 1, then used on thread 2 and killed? Will we send the kill signal to the wrong thread?

Also, is it possible that sandboxes A and B are created on thread 1, then a guest function is called on sandbox A, then the host tries to kill that guest function call; meanwhile the call on sandbox A ends and thread 1 calls a function on sandbox B, then we send the kill signal and kill the wrong one?

If we don't have tests for these scenarios then we should (and more that I haven't thought of).

Contributor Author

@ludfjig ludfjig May 29, 2025

We'll never send the signal to the wrong thread, because tid is always set before running, and we only send the signal while running is set.

However, it's possible that killing a sandbox on thread 1 will accidentally interrupt a new sandbox on the same thread. As far as I know, there is no way to avoid this:

  • Thread 1 creates a sandbox and finishes its call to vcpu::run(), but has not yet set running to false
  • Thread 2 calls kill(), and enters the while loop because running is still true
  • Thread 1 sets running to false; then on the same thread we create a new sandbox and call vcpu::run on it
  • Thread 2 sends the signal
  • Thread 1's vcpu of the new sandbox gets interrupted, but the user intended to kill the old one :/

This occurs because checking running and sending the signal is not an atomic operation, and I'm not sure how to avoid it. Basically, between the check and the signal delivery, anything could happen on another thread.


We'll send at most 1 incorrect signal, because the next check of running in the while loop is guaranteed to be false.
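
The "at most one stray signal" bound can be re-enacted deterministically. The sketch below is illustration only (Handle, kill_iteration, and signals_sent are hypothetical names, and a counter stands in for pthread_kill); it interleaves the steps listed above by hand to show why the loop cannot send a second signal once running is observed false:

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Illustrative stand-in for the interrupt handle's state.
struct Handle {
    running: AtomicBool,
    signals_sent: AtomicUsize,
}

impl Handle {
    /// One iteration of the hypothetical kill loop: check `running`,
    /// then (non-atomically) send a signal. Returns false once the
    /// loop would exit.
    fn kill_iteration(&self) -> bool {
        if !self.running.load(Ordering::Acquire) {
            return false; // running is false: stop signalling
        }
        // <-- race window: the vcpu thread may clear `running` and start
        //     a NEW vcpu on the same tid before this signal lands
        self.signals_sent.fetch_add(1, Ordering::SeqCst);
        true
    }
}

fn main() {
    let h = Handle {
        running: AtomicBool::new(true),
        signals_sent: AtomicUsize::new(0),
    };
    // Thread 2 sees running == true and commits to sending a signal...
    assert!(h.kill_iteration());
    // ...meanwhile thread 1 clears `running`; a new sandbox may now run there.
    h.running.store(false, Ordering::Release);
    // The next iteration observes running == false and stops, so at most
    // one signal could have hit the wrong vcpu.
    assert!(!h.kill_iteration());
    assert_eq!(h.signals_sent.load(Ordering::SeqCst), 1);
    println!("at most one stray signal");
}
```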

// Note: if a `InterruptHandle::kill()` signal is delivered to this thread **here**
// - after we've set the running to true,
// Then the signal does not have any effect, because the signal handler is a no-op.
// This is fine since we are already done with the `VcpuFd::run()` call.
Contributor

I don't think that this statement is true; for example, the reason for the exit may be to call a host function, in which case we want to stop the execution. I think that this scenario is handled in the current implementation.

Contributor Author

@ludfjig ludfjig May 28, 2025

In the current implementation, if the user thread times out while waiting for the hypervisor_handler thread, the user thread regains control, but that doesn't actually stop the host function call from happening. Let's add host-function cancellation in another PR, since it was never properly supported.

@@ -634,58 +666,8 @@ impl Hypervisor for KVMDriver {
}
}

#[cfg(test)]
Contributor

Why did these tests get deleted?

Contributor Author

Deleted in kvm/windows/mshv in favor of running the test from mod.rs, since they all do the same thing regardless


#[cfg(gdb)]
Contributor

Why did this get deleted? We still want to test initialize, don't we? It's still being executed, just not via the handler?

Contributor Author

Added back! Whopsie

@ludfjig ludfjig force-pushed the cancel_vm3 branch 10 times, most recently from 34a5772 to baf9912 Compare May 30, 2025 22:18
ludfjig added 5 commits May 30, 2025 16:58
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
…eventing host functions to return control to guest after being interrupted

Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Labels
kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.
Development

Successfully merging this pull request may close these issues.

Remove the Sandbox Execution thread, replace elapsed-time-based timeouts with a Kill function
2 participants