Conversation

simongdavies
Contributor

Fixes a race condition in which a sandbox kill that arrives after the sandbox has already exited successfully causes the subsequent run to fail.

This PR contains a breaking change. Previously, if kill was called on an InterruptHandle before a guest call started, or at any time when no guest call was in progress, the next guest call made on the Sandbox would be cancelled. Now this scenario is a no-op: kill only takes effect if a guest call is running.
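
The new semantics can be illustrated with a minimal, single-threaded sketch. It is not the PR's implementation: `SketchSandbox`, `SketchInterruptHandle`, `Cancelled`, and the two flags are hypothetical names chosen for illustration, and the sketch ignores the cross-thread races that the real change has to handle.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical error returned when a guest call was cancelled by `kill`.
#[derive(Debug, PartialEq, Eq)]
pub struct Cancelled;

/// Hypothetical stand-in for an interrupt handle (not the PR's type).
#[derive(Default)]
pub struct SketchInterruptHandle {
    call_active: AtomicBool,
    cancel_requested: AtomicBool,
}

impl SketchInterruptHandle {
    /// Requests cancellation only if a guest call is in progress.
    /// Returns `false` (and changes nothing) when no call is running.
    pub fn kill(&self) -> bool {
        if !self.call_active.load(Ordering::SeqCst) {
            return false; // no-op: nothing is carried over to the next call
        }
        self.cancel_requested.store(true, Ordering::SeqCst);
        true
    }
}

/// Hypothetical stand-in for a sandbox that runs guest calls.
#[derive(Default)]
pub struct SketchSandbox {
    handle: Arc<SketchInterruptHandle>,
}

impl SketchSandbox {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn interrupt_handle(&self) -> Arc<SketchInterruptHandle> {
        Arc::clone(&self.handle)
    }

    /// Runs `guest` as a "guest call" and reports whether a kill landed
    /// while it was active.
    pub fn call<F: FnOnce()>(&self, guest: F) -> Result<(), Cancelled> {
        self.handle.call_active.store(true, Ordering::SeqCst);
        guest();
        self.handle.call_active.store(false, Ordering::SeqCst);
        // Consume the request so it cannot leak into a later call.
        if self.handle.cancel_requested.swap(false, Ordering::SeqCst) {
            Err(Cancelled)
        } else {
            Ok(())
        }
    }
}
```

In this sketch, calling `kill()` before `call(...)` returns `false` and the following call completes with `Ok(())`. Only a `kill()` issued while the closure is running yields `Err(Cancelled)`.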

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
…sted by us

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
/// retrying until either:
/// - The signal is successfully delivered (VCPU transitions from running to not running)
/// - The VCPU stops running for another reason (e.g., call completes normally)
/// - No call is active (call_active=false)
Contributor

Does it block? I thought it would just return false?
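
For readers following this thread, here is one way a retry loop matching the three exit conditions in the doc comment could look. This is a sketch under assumptions, not the code under review: `VcpuState`, `kill_with_retry`, the `signals_sent` counter, and the sleep-based retry are all hypothetical. In this version the function returns `false` immediately when no call is active and otherwise blocks, sleeping between retries, until the vCPU stops running. Whether the PR's implementation behaves the same way is what the comment above asks, and the sketch does not settle it.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;
use std::time::Duration;

/// Hypothetical shared state between the killer and the vCPU thread.
#[derive(Default)]
pub struct VcpuState {
    /// True while a guest call is in progress.
    pub call_active: AtomicBool,
    /// True while the vCPU is executing guest code.
    pub running: AtomicBool,
    /// Counts simulated signal deliveries (stands in for a real signal).
    pub signals_sent: AtomicUsize,
}

impl VcpuState {
    pub fn new() -> Self {
        Self::default()
    }
}

/// Keeps "signalling" until the vCPU stops running or no call is active.
/// Returns `true` if at least one signal was sent, `false` otherwise.
pub fn kill_with_retry(state: &VcpuState, retry_delay: Duration) -> bool {
    let mut sent = false;
    // Both exit conditions are re-checked on every iteration:
    // - call_active == false: there is nothing to cancel
    // - running == false: the signal landed or the call finished on its own
    while state.call_active.load(Ordering::Acquire)
        && state.running.load(Ordering::Acquire)
    {
        // Stand-in for delivering a real signal to the vCPU thread.
        state.signals_sent.fetch_add(1, Ordering::Relaxed);
        sent = true;
        thread::sleep(retry_delay);
    }
    sent
}
```

With both flags `false`, `kill_with_retry` returns `false` without sleeping. With a simulated vCPU thread that clears `running` after a short delay, it returns `true` once the flag flips.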

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
// The virtualization stack can use this function to return the control
// of a virtual processor back to the virtualization stack in case it
// needs to change the state of a VM or to inject an event into the processor
debug!("Internal cancellation detected, returning Retry error");
jsturtevant and others added 7 commits October 17, 2025 17:43
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
/// This is acceptable because the generation tracking provides an additional
/// safety layer. Even if a stale kill somehow stamped cancel_requested, the
/// generation mismatch would cause it to be ignored.
call_active: AtomicBool,
Contributor

I think this flag is redundant. We already don't send any signals when kill is called if the vCPU is not running.
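
The generation-tracking idea mentioned in the doc comment under review can be sketched as follows. This is an illustration only: `GenTracker` and its methods are hypothetical names, and the PR may encode the generation and the cancel flag differently. The sketch shows why a cancel request stamped with an old generation is ignored by a later call. It takes no position on whether the `call_active` flag is redundant.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical generation tracker: every guest call gets a fresh
/// generation number, and a cancel request records which generation
/// it was aimed at.
#[derive(Default)]
pub struct GenTracker {
    /// Generation of the most recently started call (0 = no call yet).
    generation: AtomicU64,
    /// Generation a kill was stamped with (0 = no cancel requested).
    cancel_stamp: AtomicU64,
}

impl GenTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Starts a new call and returns its generation (always >= 1).
    pub fn begin_call(&self) -> u64 {
        self.generation.fetch_add(1, Ordering::AcqRel) + 1
    }

    /// Records a cancel request aimed at `observed_generation`.
    pub fn request_cancel(&self, observed_generation: u64) {
        self.cancel_stamp.store(observed_generation, Ordering::Release);
    }

    /// A call is cancelled only if the stamp matches its own generation,
    /// so a stale stamp from an earlier call has no effect.
    pub fn is_cancelled(&self, call_generation: u64) -> bool {
        self.cancel_stamp.load(Ordering::Acquire) == call_generation
    }
}
```

Here a kill stamped with the first call's generation cancels that call, but the second call, which has a new generation, sees a mismatch and is unaffected.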

Labels

kind/bugfix For PRs that fix bugs

3 participants