@dblnz dblnz commented Oct 31, 2025

This is part of #964.

Description

This PR changes how the guest reports tracing data to the host. Instead of passing a pointer to the raw data in guest memory, the guest now serializes the data and passes a pointer to the serialized buffer.
The host side can then interpret that pointer as an offset into the shared memory, copy the buffer into host memory, deserialize it, and parse the trace events to emit the corresponding tracing spans/events on the host.
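
To illustrate the host-side retrieval described above, here is a minimal sketch of copying a guest-provided buffer out of shared memory with bounds checks. The names `copy_guest_buffer` and `CopyError` are hypothetical and not taken from this PR, and the sketch assumes the guest pointer has already been translated into an offset into the shared region.

```rust
/// Hypothetical error type for this sketch (not part of the PR).
#[derive(Debug, PartialEq)]
pub enum CopyError {
    /// The requested range does not fit inside the shared region.
    OutOfBounds { offset: usize, len: usize, size: usize },
}

/// Copies `len` bytes starting at `offset` out of `shared` into a host-owned Vec.
/// Assumption: `offset` is already relative to the start of the shared region.
pub fn copy_guest_buffer(
    shared: &[u8],
    offset: usize,
    len: usize,
) -> Result<Vec<u8>, CopyError> {
    let err = CopyError::OutOfBounds { offset, len, size: shared.len() };
    // Guard against overflow when the guest supplies a bogus offset or length.
    let end = match offset.checked_add(len) {
        Some(end) => end,
        None => return Err(err),
    };
    // `get` returns None when the range extends past the end of the slice.
    match shared.get(offset..end) {
        Some(slice) => Ok(slice.to_vec()),
        None => Err(err),
    }
}
```

Copying into a host-owned `Vec` means deserialization works on data the guest can no longer modify, which is one reason to copy rather than parse in place.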

To do that, the following steps were necessary:

  1. Define a flatbuffers schema to represent the guest trace data used on the guest to store the events
  2. Implement wrappers on top of the generated flatbuffers Rust code to provide Rust types that the guest/host can use.
    These wrappers make use of the flatbuffers code to provide TryFrom and From implementations for the types used on the guest and host
  3. Change the tracing logic to use a single GuestEvent type that is an enum with the following variants:
    • OpenSpan - is created when a span is opened and stores all the corresponding info for that span including the TSC read on the guest
    • CloseSpan - is created when a span is closed and stores the TSC of when that happened
    • LogEvent - is created when a log is encountered
  4. Update the guest state logic to keep track of the active span and use the new GuestEvent defined above. When the guest data vector reaches capacity, the data is serialized and the pointer and length of that buffer are written to the CPU registers right before executing the out instruction that yields control to the host
  5. Update the host context logic so that it:
    • retrieves the pointer and length from the vCPU's registers
    • gets exclusive access to the shared memory to retrieve the slice containing the serialized buffer
    • deserializes the buffer
    • parses the events to emit the corresponding trace events on the host (span/log)
    • At the same time, it keeps track of the active span in the guest so that all new host spans created between this point and the time the guest resumes execution are set as children of the guest span. This simulates a continuous call trace.
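
The event-stream model from steps 3 and 5 can be sketched roughly as follows. The variant names come from the description above, but every field (`id`, `parent`, `name`, `message`, `tsc`) and the helpers `replay` and `active_span` are assumptions made for illustration; the actual types in the PR are generated from the FlatBuffers schema and may look different.

```rust
/// Sketch of the event enum; field names and types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum GuestEvent {
    /// Emitted when a span is opened, with the guest TSC at that moment.
    OpenSpan { id: u64, parent: Option<u64>, name: String, tsc: u64 },
    /// Emitted when a span is closed.
    CloseSpan { id: u64, tsc: u64 },
    /// Emitted when a log record is encountered.
    LogEvent { parent: Option<u64>, message: String, tsc: u64 },
}

/// Replays a batch of events, updating the list of spans that are still open.
/// `open` persists across batches, innermost span last.
pub fn replay(events: &[GuestEvent], open: &mut Vec<u64>) {
    for event in events {
        match event {
            GuestEvent::OpenSpan { id, .. } => open.push(*id),
            GuestEvent::CloseSpan { id, .. } => {
                // Remove the most recent matching open span, if any.
                if let Some(pos) = open.iter().rposition(|s| s == id) {
                    open.remove(pos);
                }
            }
            // A real host would emit a log event here; the sketch only tracks spans.
            GuestEvent::LogEvent { .. } => {}
        }
    }
}

/// The span that new host spans would be parented to in this sketch.
pub fn active_span(open: &[u64]) -> Option<u64> {
    open.last().copied()
}
```

After replaying a batch in which an outer span is opened, an inner span is opened and closed, and nothing else is closed, `active_span` returns the outer span's id, which is the span host-side work would be attached to until the guest runs again.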

Special notes

I recommend paying special attention to the following areas:

  • flatbuffer schema definition and serialization/deserialization
  • guest memory access from the host and exclusive memory handling in that case
  • guest buffer allocation and serialization of data

@dblnz dblnz added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Oct 31, 2025
@dblnz dblnz force-pushed the use-heap-for-guest-tracing branch 2 times, most recently from aaa356b to bc699db Compare November 3, 2025 13:13
@dblnz dblnz marked this pull request as ready for review November 3, 2025 15:09
ludfjig previously approved these changes Nov 4, 2025
@ludfjig ludfjig left a comment

LGTM. BTW, what happens if the guest is built with the tracing feature but the host is not?

@ludfjig ludfjig requested a review from Copilot November 4, 2025 18:59
Copilot AI left a comment

Pull Request Overview

This PR refactors the guest tracing system to use FlatBuffers for serialization instead of the heapless crate with fixed-size arrays. This allows for dynamic allocation and removes the constraints of pre-allocated buffer sizes.

Key changes:

  • Replaced heapless dependency with Vec-based collections for dynamic allocation
  • Introduced FlatBuffers schema (guest_trace_data.fbs) for trace data serialization
  • Changed from passing raw memory pointers to passing serialized data buffers between guest and host
  • Simplified event tracking by using a stream-based model (open/close events) instead of maintaining mutable span state
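
As a rough illustration of the Vec-based buffering and "serialize when full" behaviour listed above, the sketch below collects events and returns a serialized byte buffer once capacity is reached. `TraceBuffer` is a hypothetical name, events are reduced to bare `u64` timestamps, and the length-prefixed little-endian encoding is only a stand-in for the FlatBuffers serialization the PR actually uses.

```rust
/// Hypothetical guest-side buffer; events are simplified to u64 timestamps.
pub struct TraceBuffer {
    events: Vec<u64>,
    capacity: usize,
}

impl TraceBuffer {
    pub fn new(capacity: usize) -> Self {
        Self { events: Vec::with_capacity(capacity), capacity }
    }

    /// Records an event. Returns the serialized batch once capacity is reached,
    /// which is the point where the guest would hand the buffer to the host.
    pub fn push(&mut self, tsc: u64) -> Option<Vec<u8>> {
        self.events.push(tsc);
        if self.events.len() >= self.capacity {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Stand-in encoding (NOT FlatBuffers): u32 count, then each event as 8 LE bytes.
    pub fn flush(&mut self) -> Vec<u8> {
        let mut out = Vec::with_capacity(4 + self.events.len() * 8);
        out.extend_from_slice(&(self.events.len() as u32).to_le_bytes());
        // `drain` empties the Vec so the next batch starts from scratch.
        for event in self.events.drain(..) {
            out.extend_from_slice(&event.to_le_bytes());
        }
        out
    }
}
```

In the PR, the pointer and length of the resulting buffer are what get written to the vCPU registers before control is yielded to the host; the sketch stops at producing the bytes.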

Reviewed Changes

Copilot reviewed 15 out of 27 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/schema/guest_trace_data.fbs` | New FlatBuffers schema defining guest trace data structures |
| `src/schema/all.fbs` | Added include for new guest trace data schema |
| `src/hyperlight_common/src/flatbuffer_wrappers/guest_trace_data.rs` | New module implementing serialization/deserialization for guest trace data |
| `src/hyperlight_common/src/flatbuffer_wrappers/mod.rs` | Added guest_trace_data module export |
| `src/hyperlight_common/src/flatbuffers/mod.rs` | Added exports for new FlatBuffers-generated types |
| `src/hyperlight_common/src/flatbuffers/hyperlight/generated/*.rs` | Auto-generated FlatBuffers code for trace data structures |
| `src/hyperlight_guest_tracing/src/lib.rs` | Removed heapless-based type definitions (Spans, Events, TraceLevel, etc.) |
| `src/hyperlight_guest_tracing/src/state.rs` | Refactored to use Vec instead of heapless, changed to event-stream model |
| `src/hyperlight_guest_tracing/src/visitor.rs` | Simplified field visitor to use String/Vec instead of heapless types |
| `src/hyperlight_guest_tracing/Cargo.toml` | Removed heapless dependency |
| `src/hyperlight_host/src/sandbox/trace/context.rs` | Updated to deserialize FlatBuffers and process event stream |
| `src/hyperlight_host/src/hypervisor/*.rs` | Changed memory manager reference from immutable to mutable |
| `src/hyperlight_guest/src/exit.rs` | Updated to pass serialized data pointer/length instead of raw struct pointers |
| `Justfile` | Added test commands for new trace_guest feature |
| `Cargo.lock` & guest `Cargo.lock` files | Removed heapless and its transitive dependencies |


dblnz commented Nov 4, 2025

LGTM. BTW, what happens if the guest is built with the tracing feature but the host is not?

The host crashes with:

called `Result::unwrap()` on an `Err` value: AnyhowError(Invalid OutBAction value: 104)

The panic is raised at https://github.com/hyperlight-dev/hyperlight/blob/main/src/hyperlight_host/src/sandbox/outb.rs#L155
and the error itself stems from https://github.com/hyperlight-dev/hyperlight/blob/main/src/hyperlight_common/src/outb.rs#L117

I think it is fine for this to fail: unless the host expects this action from the guest, it should reject it.
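
For readers unfamiliar with this failure mode, here is a minimal sketch of how a fallible conversion from a raw port value produces such an error. The `Action` enum, its variants, and the numeric values 99, 101 and 102 are hypothetical and are not copied from `hyperlight_common`; only the value 104 and the error text come from the message above.

```rust
use std::convert::TryFrom;

/// Hypothetical action set for a host built WITHOUT tracing support.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Log,
    CallFunction,
    Abort,
}

impl TryFrom<u16> for Action {
    type Error = String;

    fn try_from(value: u16) -> Result<Self, Self::Error> {
        match value {
            // Illustrative numbering only.
            99 => Ok(Action::Log),
            101 => Ok(Action::CallFunction),
            102 => Ok(Action::Abort),
            // Anything the host was not built to handle is rejected.
            other => Err(format!("Invalid OutBAction value: {}", other)),
        }
    }
}
```

Calling `unwrap()` on the `Err` returned for 104 would panic in the same way as the crash reported above; propagating the error instead would let the host fail without panicking.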

jsturtevant previously approved these changes Nov 4, 2025
dblnz added 4 commits November 4, 2025 21:55
- The files have been generated using `flatbuffers` v25.9.23,
  which correctly generates `unsafe` code, unlike previous versions

Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
@dblnz dblnz dismissed stale reviews from jsturtevant and ludfjig via ab93d7d November 4, 2025 19:56
@dblnz dblnz force-pushed the use-heap-for-guest-tracing branch from bc699db to ab93d7d Compare November 4, 2025 19:56
@dblnz dblnz merged commit ab93d7d into hyperlight-dev:main Nov 4, 2025
41 checks passed