Add a compartment initialisation framework #275

Open
PhilDay-CT opened this issue Jul 25, 2024 · 9 comments

Comments

@PhilDay-CT
Contributor

Sometimes there is code in a compartment that you want to run once at startup. For example, in the configuration broker example each parser needs to register a callback with the broker (to avoid hard-coding the list of parsers into the broker).

Running a thread in each compartment to do this is wasteful and, since threads aren't expected to exit, leaves them hanging around. Doing it from a single thread means exposing a compartment call that's not otherwise needed.

Would something like compartment_init(), following the pattern of compartment_error_handler(), work?
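
The run-time registration pattern described above can be modelled on a host in plain C++. This is only an illustrative sketch: `Broker`, `ParserCallback`, `register_parser`, and `set` are hypothetical names, and an ordinary method call stands in for the cross-compartment call that each parser would make at startup.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical callback type: a parser validates and applies one config value.
using ParserCallback = std::function<bool(const std::string &)>;

// Stand-in for the broker compartment. It knows nothing about specific parsers.
class Broker
{
    std::map<std::string, ParserCallback> parsers;

    public:
    // In the real system this would be the cross-compartment entry point
    // that each parser calls once at startup.
    bool register_parser(const std::string &item, ParserCallback callback)
    {
        // emplace leaves an existing registration untouched and reports failure.
        return parsers.emplace(item, std::move(callback)).second;
    }

    // Route a new value to whichever parser registered for the item.
    bool set(const std::string &item, const std::string &value) const
    {
        auto it = parsers.find(item);
        return it != parsers.end() && it->second(value);
    }
};

// Usage: the "logger" parser registers itself, then values are routed to it.
void broker_demo()
{
    Broker broker;
    bool   registered = broker.register_parser(
      "logger", [](const std::string &value) { return !value.empty(); });
    assert(registered);
    assert(broker.set("logger", "debug"));
    assert(!broker.set("network", "dhcp")); // nobody registered for this item
}
```

The broker stays unchanged as parsers are added, which is the property the issue wants to keep; the open question is who makes the `register_parser` call and when.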

@davidchisnall
Collaborator

I agree that something in this space would be nice. There are a lot of problems that are similar to the ones with shared-library global initialisers.

To start with, this is what we have now:

  • C++ static initialisers work, so you can lazily initialise variables the first time they enter scope.
  • We build on this to run global constructors, so it's easy to have code that runs all constructors the first time that a compartment entry point is invoked.
  • Compartments that are thread entry points can run construction code early.
  • We can build barriers with futexes so that threads can block until some initialisation is run.
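
The first bullet relies on standard C++ behaviour: a function-local static is constructed the first time control reaches its declaration. A minimal host-side sketch follows; `ParserTable`, `parser_table`, and `constructionCount` are made-up names used only to make the timing observable.

```cpp
#include <cassert>

// Hypothetical counter so the example can observe when construction happens.
static int constructionCount = 0;

struct ParserTable
{
    int entries = 0;
    ParserTable()
    {
        ++constructionCount;
        entries = 3;
    }
};

// A function-local static is constructed the first time control reaches its
// declaration, so nothing runs until the first call.
ParserTable &parser_table()
{
    static ParserTable table;
    return table;
}

// Usage: repeated calls share one instance and construct it only once.
int lazy_init_demo()
{
    ParserTable &first  = parser_table();
    ParserTable &second = parser_table();
    assert(&first == &second);
    return constructionCount;
}
```

The cost of this pattern is that initialisation happens on whichever thread first reaches the variable, which is part of what a generic mechanism would need to control.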

If we have a generic mechanism, a few problems arise immediately:

  • What order do the constructors run in? If I have two compartments, A and B, how do I express the idea that B's constructors depend on A's having run, and so A's must happen first? What should happen in more complex cases where there's a circular dependency? (Ideally, we'd catch that at link time somehow.)
  • What thread do constructors run on and how do we ensure that it is large enough (both stack and trusted stack) for all constructor logic?
  • What happens if one of the constructors doesn't terminate? Does this block the boot? Do we also require another thread as a watchdog that can handle partial startup?
  • What happens if one compartment's initialisation faults and leaves it in an undefined state? Related to the first one, what happens if B has constructors that depend on A being initialised, but A faults during construction?
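
The ordering question in the first bullet is a topological sort, and a circular dependency shows up as a sort that cannot complete. The sketch below uses Kahn's algorithm on a host; it is not part of the RTOS or its linker, and `init_order` and `Dependency` are hypothetical names. It addresses only ordering and cycle detection, not the stack-size, termination, or fault questions.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <queue>
#include <utility>
#include <vector>

// An edge {a, b} means "b's constructors depend on a's", so a must run first.
// Indices are assumed to be smaller than compartmentCount.
using Dependency = std::pair<std::size_t, std::size_t>;

// Returns an initialisation order, or std::nullopt if dependencies are circular.
std::optional<std::vector<std::size_t>>
init_order(std::size_t compartmentCount, const std::vector<Dependency> &dependencies)
{
    std::vector<std::vector<std::size_t>> dependants(compartmentCount);
    std::vector<std::size_t>              unmet(compartmentCount, 0);
    for (const auto &[first, later] : dependencies)
    {
        dependants[first].push_back(later);
        ++unmet[later];
    }

    // Start with every compartment that depends on nothing.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < compartmentCount; ++i)
    {
        if (unmet[i] == 0)
        {
            ready.push(i);
        }
    }

    std::vector<std::size_t> order;
    while (!ready.empty())
    {
        std::size_t next = ready.front();
        ready.pop();
        order.push_back(next);
        for (std::size_t dependant : dependants[next])
        {
            if (--unmet[dependant] == 0)
            {
                ready.push(dependant);
            }
        }
    }

    // Anything left unscheduled is part of (or behind) a cycle.
    if (order.size() != compartmentCount)
    {
        return std::nullopt;
    }
    return order;
}

void init_order_demo()
{
    // B (index 1) depends on A (index 0), so A is initialised first.
    std::vector<Dependency> acyclic{{0, 1}};
    auto                    order = init_order(2, acyclic);
    assert(order.has_value());
    assert((*order == std::vector<std::size_t>{0, 1}));

    std::vector<Dependency> circular{{0, 1}, {1, 0}};
    assert(!init_order(2, circular).has_value());
}
```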

I don't think we can solve this in the general case (happy to be wrong!), but I would welcome suggestions for how we can at least provide some general infrastructure for the common cases.

Currently, the simplest pattern is:

  • Designate one thread to run constructors and have it run them on startup.
  • Block all other threads behind a barrier.
  • Add a policy that says that the initialisation entry points can be called only from the startup compartment.

The thread that runs the constructors needs one trusted stack frame to handle the startup bits before it enters its main entry point. It can then define the initialisation order and what to do if any fail. If you want to handle infinite looping, a second thread can do a timed wait on the barrier and handle errors if the timeout is triggered. This requires a lot of bespoke code, so is not ideal.
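
The barrier-plus-watchdog pattern can be sketched on a host with a mutex and condition variable standing in for the futex. `StartupBarrier` and its methods are hypothetical names; on the RTOS the same shape would be built on the futex primitives mentioned earlier.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Host-side stand-in for a futex-based startup barrier.
class StartupBarrier
{
    std::mutex              lock;
    std::condition_variable changed;
    bool                    initialised = false;

    public:
    // Called by the designated thread once all constructors have run.
    void release()
    {
        {
            std::lock_guard<std::mutex> guard(lock);
            initialised = true;
        }
        changed.notify_all();
    }

    // Called by every other thread. Returns false if the timeout expires
    // first, which is the watchdog's cue to handle a stuck initialiser.
    bool wait_for(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> guard(lock);
        return changed.wait_for(guard, timeout, [this] { return initialised; });
    }
};

void startup_barrier_demo()
{
    StartupBarrier barrier;
    // Initialisation has not finished: the timed wait reports a timeout.
    assert(!barrier.wait_for(std::chrono::milliseconds(5)));
    barrier.release();
    // After release, waiters return immediately.
    assert(barrier.wait_for(std::chrono::milliseconds(5)));
}
```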

rmn30 added a commit that referenced this issue Jan 22, 2025
This is useful because although Sonata has a second UART, it's not currently initialised.
See #317 and #275.
@nwf
Member

nwf commented Jan 22, 2025

Musing a bit, forgive some half-baked thoughts.

I think it could be relatively straightforward to offer severely limited constructors, which...

  • are trusted for system availability
  • cannot make cross-compartment calls
  • run on some temporary stack (to be recycled as part of the heap, say), with stack memory zeroed on entry
  • have access only to one compartment's (or library's, even) imports and globals
    • imports implies access to statically shared objects, even if imported entry points are not likely to be useful
    • could also be granted access to all static sealed objects of a particular type, if we built and exposed those linkersets
  • are erased after use, with memory forming part of the heap like we do with export tables

I think the limitations here probably mean we don't have to worry about dependencies among such constructors. That said, despite the limitations, these seem good enough to...

  • get the UARTs off the ground

  • run static initializers statically rather than lazily at use (yes?)

  • walk through intra-compartment collections (of parsers in the configuration broker, say)

  • walk through collections of static objects (if we wanted, or wanted to later; one use case is that the allocator presently lazily initializes quota objects, but we could front that)

Does that seem like enough utility to merit the throw-weight of the requisite support code in the loader?

@davidchisnall
Collaborator

I think it is useful. The loader code should mostly be preparing an array of cross-compartment callbacks.

The right place for this to run is probably the fake thread that the scheduler has, which also has no trusted stack and is the one the loader runs on.

rmn30 added a commit that referenced this issue Jan 23, 2025
This is useful because although Sonata has a second UART, it's not currently initialised.
See #317 and #275.
@PhilDay-CT
Contributor Author

It sounds like there's a contradiction between the limitation on not being able to make cross-compartment calls and the point about walking through intra-compartment collections?

In the config broker initialisation, each parser makes a cross-compartment call into the broker to register a callback. Would that be supported here?

@davidchisnall
Collaborator

> It sounds like there's a contradiction between the limitation on not being able to make cross-compartment calls and the point about walking through intra-compartment collections?

I think that's the hardest thing to support. I don't really want to support cross-compartment calls because that makes it hard to reason about the initial state. If each initialiser is self contained, then you guarantee non-interference between two that run in compartments with access to disjoint sets of devices / pre-shared objects. If they can do cross-compartment calls, that's much harder.

Initialising global constructors and doing early device init are the two main use cases for me. The latter, in particular, where we could give a compartment with no entry points except the init code a richer set of permissions, would enable some useful things. For example, for the revoker, the revoker-init compartment could have access to the full device range and configure the start and end region for scanning, then the allocator could have access to only the range that lets it start revocation and read the epoch counter.
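
The revoker example amounts to handing two compartments differently bounded views of the same device. The sketch below models that with plain offsets rather than real capabilities. `DeviceWindow` is a hypothetical type, and the register names and offsets are invented for illustration; they are not the actual revoker layout.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Models a bounded view of a device's registers (a stand-in for capability
// bounds). Offsets are in bytes from the start of the device.
struct DeviceWindow
{
    std::size_t base;
    std::size_t length;

    // True if [offset, offset + size) lies entirely inside this view.
    bool covers(std::size_t offset, std::size_t size) const
    {
        return offset >= base && size <= length && offset - base <= length - size;
    }

    // Derive a narrower view. A view can never be widened, as with capabilities.
    std::optional<DeviceWindow> narrow(std::size_t newBase, std::size_t newLength) const
    {
        if (!covers(newBase, newLength))
        {
            return std::nullopt;
        }
        return DeviceWindow{newBase, newLength};
    }
};

// Hypothetical register layout, for illustration only.
constexpr std::size_t ScanStart  = 0;
constexpr std::size_t ScanEnd    = 4;
constexpr std::size_t Epoch      = 8;
constexpr std::size_t Kick       = 12;
constexpr std::size_t DeviceSize = 16;

void revoker_window_demo()
{
    // The init-only compartment sees everything and can set the scan range.
    DeviceWindow initView{0, DeviceSize};
    assert(initView.covers(ScanStart, 4) && initView.covers(ScanEnd, 4));

    // The allocator gets only the epoch counter and the start-revocation register.
    auto allocatorView = initView.narrow(Epoch, DeviceSize - Epoch);
    assert(allocatorView.has_value());
    assert(allocatorView->covers(Epoch, 4));
    assert(allocatorView->covers(Kick, 4));
    assert(!allocatorView->covers(ScanStart, 4));
    assert(!allocatorView->narrow(0, DeviceSize).has_value());
}
```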

@nwf
Member

nwf commented Jan 23, 2025

> In the config broker initialisation, each parser makes a cross-compartment call into the broker to register a callback. Would that be supported here?

Oh, I'd missed that point and was thinking about something more akin to linker sets to get collections of things from across a compartment's linkage to iterate at startup. It seems plausible that we could do cross-compartment linker sets, too, making the broker's registration of callbacks more declarative, but I'd want to expose such functionality to auditing as well. Will ponder further.

@davidchisnall
Collaborator

I think we'd want each compartment to expose zero or one initialisation entry points. For the config broker, I'm not sure what the right way of doing that registration is. It feels like doing it at run time, rather than link time, is probably the wrong approach, but I'm willing to be convinced otherwise.

@PhilDay-CT
Contributor Author

@nwf It could be that my design is an outlier. I was trying to keep it so that the broker was a compartment that could be included unchanged and just extended by adding other compartments around it (it's kind of in our DNA as a company to turn everything into a configuration problem rather than a dev one).

@nwf
Member

nwf commented Jan 23, 2025

> I think we'd want each compartment to expose zero or one initialisation entry points.

Sold. [1]

> I was trying to keep it so that the broker was a compartment that could be included unchanged and just extended by adding other compartments around it

An admirable goal indeed! I think we can keep it, declaratively and without the use of cross-compartment calls, atop the proposed limited framework above, with some restructuring and (quite a bit of) engineering. I think the steps are...

  1. Permit static sealed values to be statically initialized with pointers to (the containing compartment's) export table entries. [2]

  2. Redefine the PARSER_CONFIG_CAPABILITY types, conflating them with the broker's InternalConfigItem, so that the broker can build its configuration data structure in situ, without needing heap allocation. The audit framework can ensure that internal fields are statically zero.

    Use the affordance from point 1 to store a pointer to (one of) the compartment's parsing entry point(s) within (each of) its PARSER_CONFIG_CAPABILITY object(s).

  3. Collect sealed pointers to all static objects of each type in the system into a linker set.

  4. Collect the set of such linker sets in another, in a type-indexed way.

  5. Grant the broker's compartment_init() function, and everybody else's, (read-only, shallow-non-capture [3]) access to those (say, nullptr-terminated) linker sets, in exchange for proving that they hold the unsealing type.

Then the broker's compartment_init() function could look something like...

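// staticSeals is the proposed accessor; it is assumed here to return a
// nullptr-terminated array of sealed pointers for the given sealing type.
void compartment_init(CHERI_SEALED(void *) *(*staticSeals)(SKey key))
{
  // ...
  configData = nullptr;

  /*
   * Find all the ParserConfigKey-sealed static objects in the system and thread
   * them onto a linked list.  Keep the (unsealed) pointer to the head of this
   * list internally for our use.
   */
  auto configs = staticSeals(STATIC_SEALING_TYPE(ParserConfigKey));
  while (auto sealedConfig = *configs++)
  {
      // unseal_config is a hypothetical helper that unseals the entry with the
      // broker's key and yields a pointer to its InternalConfigItem.
      auto *c = unseal_config(sealedConfig);
      c->next = configData;
      configData = c;
  }

  // ...
}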

(Similarly, the allocator compartment could have a nearly identical loop for assigning owner IDs to quotas.)
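
The list-threading loop can be tried on a host without any of the proposed loader support. In this sketch a plain nullptr-terminated array stands in for the linker set, the unsealing step is omitted, and `ConfigItem` and `thread_onto_list` are hypothetical names rather than the broker's real types.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the broker's per-item record; the link field lives inside the
// static object itself, so building the list needs no heap allocation.
struct ConfigItem
{
    const char *name;
    ConfigItem *next = nullptr;
};

// Thread every entry of a nullptr-terminated set onto an intrusive list and
// return the head. Each entry is pushed at the front.
ConfigItem *thread_onto_list(ConfigItem *const *set)
{
    ConfigItem *head = nullptr;
    while (ConfigItem *item = *set++)
    {
        item->next = head;
        head       = item;
    }
    return head;
}

// Count the entries reachable from head.
std::size_t list_length(const ConfigItem *head)
{
    std::size_t count = 0;
    for (; head != nullptr; head = head->next)
    {
        ++count;
    }
    return count;
}

void linker_set_demo()
{
    ConfigItem        logger{"logger"};
    ConfigItem        network{"network"};
    ConfigItem *const set[] = {&logger, &network, nullptr};

    ConfigItem *head = thread_onto_list(set);
    assert(head == &network); // the last entry ends up at the head
    assert(head->next == &logger);
    assert(logger.next == nullptr);
    assert(list_length(head) == 2);
}
```

Because the set itself is only read, its memory could be reclaimed after initialisation, while the static objects it points to remain linked together.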

Edit 2025/01/24: David reminds me that my initial design, of having type-segregated sealed object sets, is required, not just a nice to have. I've updated the above.

Given that the above is possible as an extension to the limited design sketched earlier, I think we should explore that limited design more fully first and then pick this up in a second round?

Footnotes

  1. Sadly, though, I don't think we can reuse the memory of these initialization functions like we can with the loader proper, unless we either

    1. generate a completely separate code+rodata segment for the initialization code, or
    2. end up moving to an ABI with a code/rodata/rwdata split.

    Even then, in either case, we'd have to be careful that the things we called ended up in the initialization .text segment, possibly duplicating some of the persistent .text segment. Infinite future, perhaps.

  2. Potentially, we could also allow storing (library call) sentries, pointers to shared objects, and/or (sealed!) pointers to other static sealed objects. I don't have much of a use case in mind for those, but I think it'd be safe to do.

  3. read-only for non-interference; shallow-non-capture because we'd like to reuse the memory for the set as part of the heap; not deep-non-capture, because we want to allow the immortal static objects to be captured (as by the configData pointer in the example).
