Skip to content

Conversation

buddhisthead
Copy link
Collaborator

@buddhisthead buddhisthead commented Oct 1, 2025

This PR adds a URI-like binary resolver, whose primary use is to enable passing large chunks of data over the message bus by reference and hiding the implementation from the consumer. We anticipate it's use being first in the snapshot booting process where we pass chunks of blocks to the various state modules.

The binary URI-like API has a couple of initial formats including raw bytes and a file with offset and length. Usage is that a producer of the URI creates it with a unique ID and supplies a format enum. The consumer calls a global resolver which interprets the format enum and appropriately returns the bytes using a generic function signature, so new formats and resolver traits can be supplied in the future.

The PR contains unit and integration tests.

Running Tests

Unit tests:
cargo test -p acropolis_common --lib resolver::

Integration tests:

$ cargo test --test loc_over_bus -p acropolis_common

@Copilot Copilot AI review requested due to automatic review settings October 1, 2025 19:28
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a URI-like binary resolver system to efficiently handle large data chunks over message buses. The primary purpose is to enable passing binary data by reference rather than by value, with initial use planned for the snapshot booting process.

Key changes:

  • Added a new resolver module with Loc, Registry, and Resolver types for managing memory-mapped files and inline data
  • Implemented comprehensive unit tests covering inline payloads, file-backed resolution, eviction, bounds checking, and concurrency
  • Created integration tests demonstrating end-to-end usage over the Caryatid message bus

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
common/src/resolver.rs Core resolver implementation with Registry for memory-mapped files and Loc for uniform data location
common/src/lib.rs Module declaration to expose the resolver module
common/Cargo.toml Added dependencies for bytes, memmap2, tempfile, and test infrastructure
common/tests/loc_over_bus.rs Integration test demonstrating Loc serialization and resolution over message bus
common/tests/test.toml Configuration file for integration test environment

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

impl Registry {
/// Register a file for memory-mapped access in the registry.
pub fn register_file(&self, store: StoreId, object: ObjectId, file: &File) -> Result<()> {
let mmap = unsafe { Mmap::map(file) }.context("mmap failed")?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is essential but we've managed to avoid any unsafe so far

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I'll clear that out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do end up needing unsafe, it's good practice to include a comment with every unsafe block explaining thoroughly why we, as humans, can be sure it's safe when the compiler cannot.

(unsafe is a bit of a poorly chosen word, IMO; it more so means that the compiler can't check the safety, and so the human must instead)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants