
Add acceptance queue #4638

Merged
merged 6 commits into from
Feb 26, 2022

Conversation

mhofman
Member

@mhofman mhofman commented Feb 22, 2022

refs: #3465

Best reviewed commit by commit

Description

This is a preliminary refactor splitting the run-queue in two before #3465. All send, resolve, and subscribe syscalls, as well as kernel-initiated deliveries, are first enqueued onto an "acceptance" queue, which is drained into the usual "run" queue in cranks of its own.

The next step (follow-up PR) is to change the handling of the resolve and subscribe syscalls so that they are enqueued onto the acceptance queue first instead of mutating promise state immediately (except for reference counting). Processing those and send events from the acceptance queue would then be as follows:

  • message sends to promises are enqueued onto the promise queue or the run-queue, depending on the state of the promise, instead of going through the run-queue as-is (see syscall.send to resolved promise should run-queue to object, not to promise #4542). Sends to objects are simply moved to the run-queue
  • resolve would queue notifications on the run-queue for subscribers, and move sends from the promise queue to the run-queue
  • subscribe would either add the vat to the promise subscriber list, or enqueue a notify back to the vat, depending on the promise state

The goal is to have only messages with a known destination vat pending on the run-queue, so that the run (aka delivery) and acceptance queues can be split per vat according to #3465 (comment).
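The two-queue flow can be sketched as a toy model (invented names like `makeTwoQueueKernel`; in this PR the acceptance crank simply re-queues the message onto the run queue, so the sketch does the same):

```javascript
// Toy sketch of the acceptance/run queue split described above.
// All names are illustrative, not the actual kernel API.
function makeTwoQueueKernel() {
  const acceptanceQueue = [];
  const runQueue = [];

  // Syscall events and kernel-initiated deliveries all land on the
  // acceptance queue first.
  function enqueue(message) {
    acceptanceQueue.push(message);
  }

  // One crank: prefer draining the acceptance queue into the run queue;
  // otherwise deliver the next run-queue message. Returns a description
  // of what the crank did, or undefined if there was no work at all.
  function step(deliver) {
    if (acceptanceQueue.length > 0) {
      const message = acceptanceQueue.shift();
      runQueue.push(message); // in this PR, acceptance is a simple re-queue
      return { crank: 'acceptance', message };
    }
    if (runQueue.length > 0) {
      const message = runQueue.shift();
      deliver(message);
      return { crank: 'delivery', message };
    }
    return undefined;
  }

  return { enqueue, step };
}
```

Note how each enqueued message now costs two cranks (one acceptance crank, one delivery crank), which is why the tests below had to be updated for the new crank counts.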

The maybeProcessNextMessage helper is awkward and a bit of a temporary hack. I'll rethink message handling once processing the acceptance queue does more than simply re-queue the message.

This change also introduces an emptyCrank call on the run-policy. Empty cranks (cranks that perform no delivery) were already possible before, e.g. in the case of a send queued onto an unresolved promise, but with acceptance-queue processing they are now much more common.

This PR does not change anything about vat termination, which will still be processed at the end of the current crank. Currently the delivery to vat-admin announcing the vat termination is enqueued, like every other kernel-generated delivery, onto the acceptance queue, and thus lands after any pending syscall events generated by the exited vat. In the future, with queues per vat, kernel-initiated deliveries could use their own acceptance queue, or be queued directly onto the target's queue.

This PR also contains a fix for a test inconsistency described in https://github.com/Agoric/agoric-sdk/pull/4575/files#r809781876.

Security Considerations

None

Documentation Considerations

This change should only impact the number of cranks necessary to process messages, but nothing else.

Testing Considerations

Tests were updated to account for the different number of cranks and the new state of the queues, but no specific tests were added since the existing tests seem to cover the updated code already.

@mhofman mhofman added the SwingSet package: SwingSet label Feb 22, 2022
@mhofman mhofman requested review from warner and FUDCo February 22, 2022 21:14
@mhofman mhofman force-pushed the mhofman/3465-split-run-queue branch 3 times, most recently from b9d3bf4 to 02b566c Compare February 22, 2022 23:34
Contributor

@FUDCo FUDCo left a comment


I can't make heads or tails of this with something called "syscall queue". I tried just treating it as an arbitrary, meaningless label like "cromfelter" but the cognitive interference with the actual words overwhelmed me and I just couldn't pull it off. Please either rename the abstraction or provide a deeper analysis to help me understand why that name makes any kind of sense. In particular, I don't understand the demarcation that the second queue is trying to capture.

@@ -89,7 +89,7 @@ export function initializeKernel(config, hostStorage, verbose = false) {
   vatKeeper.setSourceAndOptions({ bundleID }, creationOptions);
   vatKeeper.initializeReapCountdown(creationOptions.reapInterval);
   if (!creationOptions.enableSetup) {
-    kernelKeeper.addToRunQueue(harden({ type: 'startVat', vatID }));
+    kernelKeeper.addToSyscallQueue(harden({ type: 'startVat', vatID }));
Contributor


I totally don't understand the concept of "syscall queue", but maybe it's just a really bad name for whatever it actually is. The run queue is a queue of things that go into vats, i.e., things that will result in deliveries, or things that are done to vats. Syscalls are things that come out of a vat or are done by a vat. Syscalls are also synchronous, so the idea of a queue involving them makes no sense to me.

@mhofman
Member Author

mhofman commented Feb 23, 2022

Yes I struggle with the naming as well, but couldn't come up with anything better. Suggestions welcome.

In the discussion with @dtribble and @warner, and summarized in #3465 (comment), we were talking about inbound and outbound queues. However if you don't know the reference point, that terminology is even more confusing (the vat was the reference point, so inbound were pending message deliveries into the vat, and outbound were pending messages generated by the vat).

When I looked at syscalls, I realized there are conceptually 2 kinds of syscalls:

  • Synchronous operations which affect the state of a data structure for the usage of the vat, and potentially return a value dependent on previous synchronous syscalls. E.g. store and device operations.
  • Asynchronous operations which only affect data structures internal to the kernel, and for which the vat doesn't get an immediate result, or cannot sense the effect until another crank. E.g. send, resolve, subscribe.

The idea is to put all the asynchronous syscall messages as-is into a queue (to be named). The kernel then pops those messages from the queue and routes or processes them appropriately. For a message send, the message may be put on the delivery queue or on a promise queue, depending on the type and state of the target. For a resolve, the promise state would be updated and notifications queued for delivery to subscribers. For a subscribe, the subscribing vat would either be added to the list of subscribers, or get a notification queued back, depending on the state of the promise.
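That routing step could be sketched roughly as follows, assuming plain in-memory data structures (the real kernel keeps promise state, subscribers, and refcounts in persistent kernel storage; all names here are illustrative):

```javascript
// Hypothetical sketch of routing a message popped from the acceptance
// queue, per the description above. `kernel.promises` maps kernel promise
// IDs to { state, queue, subscribers }; anything without a promise entry
// is treated as an object target or a kernel-initiated delivery.
function routeAcceptanceMessage(msg, kernel) {
  const { runQueue, promises } = kernel;
  switch (msg.type) {
    case 'send': {
      const p = promises.get(msg.target);
      if (p && p.state === 'unresolved') {
        p.queue.push(msg); // park on the promise queue until resolution
      } else {
        runQueue.push(msg); // object target (or resolved promise): deliverable
      }
      break;
    }
    case 'resolve': {
      const p = promises.get(msg.promise);
      p.state = 'resolved';
      for (const vatID of p.subscribers) {
        runQueue.push({ type: 'notify', vatID, promise: msg.promise });
      }
      runQueue.push(...p.queue); // parked sends become deliverable
      p.queue = [];
      break;
    }
    case 'subscribe': {
      const p = promises.get(msg.promise);
      if (p.state === 'unresolved') {
        p.subscribers.push(msg.vatID);
      } else {
        runQueue.push({ type: 'notify', vatID: msg.vatID, promise: msg.promise });
      }
      break;
    }
    default:
      runQueue.push(msg); // kernel-initiated deliveries pass straight through
  }
}
```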

For a single pair of queues, like in this PR, this does not really¹ change the order of message delivery. However, once we switch to different queues per vat (#3465), and the kernel is free to process messages from vats independently from each other, this allows limiting the effect a delivery has on the overall system, since asynchronous activity from a vat is queued and processed one at a time.

I actually considered also renaming the current runQueue as deliveryQueue to highlight it should (ultimately) only contain messages for a known target vat. That doesn't help with the naming of the other queue I currently named syscallQueue, but maybe "sorting" or "routing" queue would be better? In my mind, the delivery and syscall/sorting/routing queue collectively form the run queues.

Footnotes

  ¹ Given that message sends on promises currently go directly onto the run-queue before being processed, moving to early queuing onto the promise queue for unresolved promises could delay the actual delivery, since the message would be moved to the delivery queue only once the promise is resolved (and the target vat known).

@FUDCo
Contributor

FUDCo commented Feb 23, 2022

"Routing queue" is better, especially when paired with "delivery queue", but possibly we can do even better. That pairing at least hints at a functional demarcation. I remain a little concerned that whatever thought process generated the term "syscall queue" in the first place was at least partially confused about the conceptualization behind this and therefore what this code does might also be. I know that I'm certainly confused about it. You also used the phrase "syscall messages" in your comment, which makes no sense at all to me and suggests that there is something foundational here that is wrong. Probably this is a worthy agenda item for our next kernel meeting.

@erights
Member

erights commented Feb 23, 2022

I can't make heads or tails of this with something called "syscall queue". I tried just treating it as an arbitrary, meaningless label like "cromfelter" but the cognitive interference with the actual words overwhelmed me and I just couldn't pull it off. Please either rename the abstraction or provide a deeper analysis to help me understand why that name makes any kind of sense. In particular, I don't understand the demarcation that the second queue is trying to capture.

Btw, I saw this and wondered what a "cromfelter" might actually be. So I googled and the first hit was http://habitatchronicles.com/2018/01/pool-abuse/ . Now I know ;)

@mhofman
Member Author

mhofman commented Feb 24, 2022

Another issue arising is what to do regarding the exit syscall, in particular if send or resolve calls were made in the same crank.

In the case of a clean exit, I imagine we could process these previous syscalls first (not what this PR does), so we can afford to delay the termination until a later crank. However, in the case of a failure, I'm not sure whether we need the termination to be recorded in the same crank. I assume that as long as all pending "syscalls" are processed, it doesn't matter whether the vat is terminated or not?

@mhofman mhofman force-pushed the mhofman/3465-split-run-queue branch 2 times, most recently from cb83115 to 46a5cdc Compare February 24, 2022 17:00
@mhofman mhofman changed the title from "Add syscall queue" to "Add acceptance queue" Feb 24, 2022
@mhofman mhofman force-pushed the mhofman/3465-split-run-queue branch from 46a5cdc to a1fcbeb Compare February 24, 2022 17:19
@mhofman mhofman requested a review from FUDCo February 24, 2022 17:26
@mhofman
Member Author

mhofman commented Feb 24, 2022

Renamed syscallQueue -> acceptanceQueue

@warner
Member

warner commented Feb 24, 2022

Another issue arising is what to do regarding the exit syscall, in particular if send or resolve calls were made in the same crank.

vatPowers.exit aborts the current delivery, unwinding all state changes, resetting to a world in which the vat has spontaneously terminated just before the delivery would have been made. The next thing the kernel does (with the current single run-queue) is to attempt the delivery again, which splats against the dead vat and gets rejected, precisely like any subsequent messages which are headed towards that same vat.

With a distinct vat-inbound queue, we can do the same unwinding, but then reject everything in the inbound queue immediately. No need to wait for a redelivery, which (given multiple queues) could take an unknown amount of time to be serviced.

The real question is what to do about messages that are already in the vat-outbound queue. The vat observed them being sent successfully, and has no control over how the outbound queue is serviced, so if the kernel deletes the queue upon termination, basically some random portion of the sent messages would get deleted. I don't see how programmers could reason about that.

But the alternative is to keep the vat-outbound queue around (occasionally servicing the remaining messages) even though the vat itself is terminated, which is pretty awkward. The vat lifecycle would grow from "non-existent -> active -> non-existent" to "non-existent -> active -> lingering-death -> non-existent". Maybe if we track vat-outbound queues in one data structure, and vats themselves in a separate one, it wouldn't be too hard to manage deleting the vat but retaining the queue until it is empty (and then deleting it).
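The "track queues in one data structure, vats in another" idea could be sketched like this (hypothetical names, not the kernel's actual data model): terminating a vat deletes the vat record but leaves its outbound queue in place until drained.

```javascript
// Sketch of outbound queues that outlive their vats ("lingering-death").
// All names are invented for illustration.
function makeVatTable() {
  const vats = new Map(); // vatID -> vat state
  const outboundQueues = new Map(); // vatID -> pending outbound messages

  function addVat(vatID) {
    vats.set(vatID, { active: true });
    outboundQueues.set(vatID, []);
  }

  function terminate(vatID) {
    vats.delete(vatID); // the vat itself is gone...
    // ...but its committed outbound messages survive in outboundQueues
  }

  // Service one message from a vat's outbound queue; once a terminated
  // vat's queue is empty, the lingering-death state ends and the queue
  // itself is deleted.
  function drainOne(vatID) {
    const q = outboundQueues.get(vatID);
    const msg = q && q.shift();
    if (q && q.length === 0 && !vats.has(vatID)) {
      outboundQueues.delete(vatID);
    }
    return msg;
  }

  return { vats, outboundQueues, addVat, terminate, drainOne };
}
```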

@mhofman
Member Author

mhofman commented Feb 24, 2022

vatPowers.exit aborts the current delivery, unwinding all state changes, resetting to a world in which the vat has spontaneously terminated just before the delivery would have been made.

That is not my reading of

// this is called for syscall.exit (shouldAbortCrank=false), and for any
// vat-fatal errors (shouldAbortCrank=true)
function setTerminationTrigger(vatID, shouldAbortCrank, shouldReject, info) {
  if (shouldAbortCrank) {
    assert(shouldReject);
  }
  if (!terminationTrigger || shouldAbortCrank) {
    terminationTrigger = { vatID, shouldAbortCrank, shouldReject, info };
  }
}

@FUDCo
Contributor

FUDCo commented Feb 24, 2022

vatPowers.exit aborts the current delivery, unwinding all state changes, resetting to a world in which the vat has spontaneously terminated just before the delivery would have been made.

That is not my reading of

// this is called for syscall.exit (shouldAbortCrank=false), and for any
// vat-fatal errors (shouldAbortCrank=true)
function setTerminationTrigger(vatID, shouldAbortCrank, shouldReject, info) {
  if (shouldAbortCrank) {
    assert(shouldReject);
  }
  if (!terminationTrigger || shouldAbortCrank) {
    terminationTrigger = { vatID, shouldAbortCrank, shouldReject, info };
  }
}

I think @mhofman is right on this. A non-aborting exit shuts down the vat after the state changes made during the crank are committed. This means that although the state of the vat itself goes away, any consequences to the kernel data structures (notably the runQueue) are retained, i.e., messages that were sent are actually sent.

However, I don't know that this introduces any new complications above and beyond the (non-trivial!) complications that have already been raised.

@warner
Member

warner commented Feb 24, 2022

Ah, ok, you're right, kernelSyscall.js exit() calls it with shouldAbortCrank = false. Vats doing vatPowers.exit are asking to self-terminate cleanly, committing all state changes from their current delivery. I think they should expect that all previously-sent messages (from earlier cranks) will be delivered, which means their output queue should outlive the vat.

Vats which somehow perform an illegal operation (or simply an infinite loop) and get terminated non-cleanly should not expect that any messages they sent within the same delivery will be committed, and unwinding the state takes care of that. The question of whether previous-cranks' messages get deleted or eventually delivered remains, and should have the same answer as for vatPowers.exit.

The same is true for vats that are killed externally, by adminNode~.terminateWithFailure(), but the vat in question doesn't know exactly when their parent will invoke that, making it even more random-looking to them (or to their ghost, I guess).

@mhofman
Member Author

mhofman commented Feb 25, 2022

Right, I think the summary is that this PR changes nothing about vat termination, since the acceptance queue is simply the tail of the previous run-queue.

Vat exit will only become a question once we hold a queue pair per vat. At that point, to preserve the existing delivery guarantees (messages sent in previous cranks get delivered regardless of whether the vat terminates in a later crank), the vat-outbound queue (whatever has been committed to it) needs to get drained after the vat is marked as terminated.

Member

@warner warner left a comment


Looks good, small changes suggested (and slightly-expanded test requested).


All methods should return `true` if the kernel should keep running, or `false` if it should stop.

The `computrons` argument may be `undefined` (e.g. if the crank was delivered to a non-`xs worker`-based vat, such as the comms vat). The policy should probably treat this as equivalent to some "typical" number of computrons.

`crankFailed` indicates the vat suffered an error during crank delivery, such as a metering fault, memory allocation fault, or fatal syscall. We do not currently have a way to measure the computron usage of failed cranks (many of the error cases are signaled by the worker process exiting with a distinctive status code, which does not give it an opportunity to report back detailed metering data). The run policy should assume the worst.

`emptyCrank` indicates the kernel processed a queued message which didn't result in a delivery.
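A host-side policy implementing these methods might look roughly like this (a hedged sketch: the budget values and the `TYPICAL` stand-in constant are illustrative assumptions, computrons are assumed to be BigInt, and the real policy interface may carry additional methods):

```javascript
// Illustrative run-policy: keep running until a computron budget or an
// empty-crank budget is exhausted. Numbers here are made up, not
// recommendations.
function makeComputronPolicy(computronBudget, emptyCrankBudget) {
  let computrons = 0n;
  let emptyCranks = 0;
  const TYPICAL = 100_000n; // assumed stand-in when metering data is absent

  return {
    crankComplete(details = {}) {
      // computrons may be undefined, e.g. for a non-xs-worker vat
      computrons += details.computrons === undefined ? TYPICAL : details.computrons;
      return computrons < computronBudget;
    },
    crankFailed() {
      computrons += TYPICAL * 10n; // no metering data: assume the worst
      return computrons < computronBudget;
    },
    emptyCrank() {
      emptyCranks += 1;
      return emptyCranks < emptyCrankBudget;
    },
  };
}
```

Each method returns `true` to keep the kernel running and `false` to stop, matching the contract described above.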
Member


Half of me wants to not tell the runPolicy about these. I don't know how the host application author should react: how many are too many? I guess a basic approach would be to say "no more than 200 emptyCranks in a block". Would that protect against some sort of infinite cyclic loop or DoS attack?

Let's expand on the comment a bit, add another sentence or two about what situation can cause this (syscall.send to a rejected promise?). It seems like anything pulling from a vat-outbound-queue will never result in a delivery, so maybe this happens like 50% of the time, and if so the host app author should set their threshold a lot higher.

Hm, maybe we need to rename policy.crankComplete to deliveryComplete, change crankFailed to deliveryFailed, and then add a crankHappened which means the kernel did work but not necessarily a vat.

Member


But, I'm ok with this PR landing without that change, if we make a ticket to improve it later.

Member Author


I think being more explicit to which kind of crank happened would indeed be good. Probably makes sense to include as part of the next step when changing the processing of the acceptance queue.

@@ -89,7 +89,7 @@ export function initializeKernel(config, hostStorage, verbose = false) {
   vatKeeper.setSourceAndOptions({ bundleID }, creationOptions);
   vatKeeper.initializeReapCountdown(creationOptions.reapInterval);
   if (!creationOptions.enableSetup) {
-    kernelKeeper.addToRunQueue(harden({ type: 'startVat', vatID }));
+    kernelKeeper.addToAcceptanceQueue(harden({ type: 'startVat', vatID }));
Member


If I understand correctly that "acceptance queue" is the vat-outbound-queue (populated by vat syscalls), then I'd think startVat goes on the other one (the vat-inbound queue).

Also, it's critical that startVat gets delivered to a vat before anything else is delivered to that vat, regardless of whatever fairness sampling we might implement later, otherwise messages could arrive at an incomplete vat. Putting startVat on the vat-inbound queue should guarantee this.

Member Author


Yes probably startVat could go straight to the runQueue. However I didn't want to change any message ordering in this PR, so I always enqueue everything at the end of the acceptanceQueue for now. Happy to change the potential order already in this case.

Contributor


so I always enqueue everything at the end of the acceptanceQueue for now

Ah! That totally makes sense and explains a large amount of my confusion. Now I understand why I didn't understand what the demarcation criteria were.

packages/SwingSet/tools/vat.js (review thread resolved)
packages/SwingSet/test/test-kernel.js (review thread resolved)
packages/SwingSet/src/kernel/kernel.js (review thread resolved)
/** @type {Promise<PolicyInput> | undefined} */
let resultPromise;
let message = getNextAcceptanceMessage();
if (message) {
Member


nit: I'd prefer a short-circuiting style:

let message = getNextAcceptanceMessage();
if (message) {
  return processAcceptanceMessage(message);
}
message = getNextDeliveryMessage();
if (message) {
  return processDeliveryMessage(message);
}

But the fact that you're returning a record, not just a promise, tells me that you've got something in mind for this, so ignore me if your plans don't align with my aesthetics.

BTW the return value of getNextMessage has an extra significance that needs to be kept in mind (it looks like your code does the right thing, but I wanted to be explicit about the requirement). If there's no work to be done, we don't commit any state changes. If e.g. maybeProcessNextMessage pulled something off a queue and then decided that we didn't need to do anything, we'd have a problem, because now there are outstanding state changes that need to be committed. If the kernel doesn't commitCrank before returning control to the host app, we're going to have consistency problems (sometimes the pop-from-queue changes get committed, sometimes they don't, depending upon what happens next time).

I think you're headed this way, but we should aim for a pattern in which crank commits happen at the end of every crank, regardless of whether that crank includes a delivery or not. When deliveries are unwound, we unwind the entire crank buffer, and we need to make sure that we still pop off whatever queue entry started the process. The getNextMessage call in the discardFailedDelivery branch of processQueueMessage does this, and we need to make sure it still does the right thing no matter what queue it came from.
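That commit discipline can be illustrated with a toy snapshot-based store (invented names; the real kernel stages changes in a crank buffer over persistent state). The point is that popping a message always ends in exactly one commit, even when the delivery itself is unwound:

```javascript
// Toy model: `staged` is the crank buffer, `committed` is durable state.
// A failed delivery is rolled back, but the pop is re-applied and committed
// so the message is still consumed.
function makeToyKernel(deliver) {
  let committed = { queue: [], log: [] };
  let staged = structuredClone(committed);

  const commitCrank = () => { committed = structuredClone(staged); };
  const abortCrank = () => { staged = structuredClone(committed); };

  function runOneCrank() {
    if (staged.queue.length === 0) {
      return false; // no work: nothing staged, no commit required
    }
    const message = staged.queue.shift(); // staged, not yet committed
    try {
      deliver(message, staged); // may mutate staged state
      commitCrank();
    } catch (err) {
      abortCrank(); // unwind delivery effects, including the pop...
      staged.queue.shift(); // ...then re-pop the failed message
      commitCrank(); // ...and commit that state change on its own
    }
    return true;
  }

  return {
    enqueue: m => { staged.queue.push(m); committed.queue.push(m); },
    runOneCrank,
    getState: () => committed,
  };
}
```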

Member Author

@mhofman mhofman Feb 25, 2022


returning a record, not just a promise

In general I don't like having a return value of `Promise | undefined`, as it would mean something is maybe async. I also prefer short-circuiting, but in this case it didn't feel right.

I am planning on ripping this ugly function apart in the next PR, and didn't want to spend time making something that was right, but happy to revisit.

I think you're headed this way, but we should aim for a pattern in which crank commits happen at the end of every crank, regardless of whether that crank includes a delivery or not

I believe that's already the case. The only time we don't create a crank is if there is no message at all. I will keep making sure that a crank is always created and committed or aborted if a message is popped from a queue.

@@ -1013,7 +1046,7 @@ export default function buildKernel(
   const source = { bundle };
   const vatID = kernelKeeper.allocateUnusedVatID();
   const event = { type: 'create-vat', vatID, source, dynamicOptions };
-  kernelKeeper.addToRunQueue(harden(event));
+  kernelKeeper.addToAcceptanceQueue(harden(event));
Member


Hm.. FYI, this call only happens while the vatAdmin vat is running, because it's (normally) the only one with access to devices.vatAdmin, and this is an endowment provided only to the device. Putting it on the acceptance queue kinda claims to rate-limit the creation of new vats, but that's already limited by the ability to deliver vatAdminService~.createVat() deliveries into vatAdmin.

I think it might be better to drop this on the other queue. Or to remember (maybe add a note) that this wants to go onto its own queue (not associated with any particular vat). The create-vat events are not vat syscalls (even though they're triggered by a syscall.callNow(vatAdminDevice) from vatAdminVat), and devices.vatAdmin does not know which vat is asking it to use its kernel endowment, and we're not considering having device-outbound queues to match the vat-outbound queues.

Member Author


Or to remember (maybe add a note) that this wants to go onto its own queue (not associated with any particular vat).

Planning on revisiting all enqueuing in the next PR. As stated I'm trying not to change any order in this PR.


kernelKeeper.addToRunQueue(message);

kernelKeeper.processRefcounts();
Member


My feeling is that we could skip this processRefcounts because nothing could change them during a simple queue-to-queue transfer, but it's safer to leave it in; in the future the transfer function might cause promises to be rejected or messages to go splat, and those certainly would change refcounts.

Member Author


I am planning on revisiting refcounts in the next PR: currently popping a send from the runQueue decrements refcounts, and re-increments them when re-queueing on the promise queue. Obviously that's extra work we usually won't need when popping a send from the outbound / acceptance queue, since the send will likely just move to another queue (except, as you say, if it's sent to an invalid promise).

@mhofman mhofman force-pushed the mhofman/3465-split-run-queue branch from a1fcbeb to ede05b2 Compare February 25, 2022 22:44
@@ -1016,7 +1050,7 @@ export default function buildKernel(
   const source = { bundle };
   const vatID = kernelKeeper.allocateUnusedVatID();
   const event = { type: 'create-vat', vatID, source, dynamicOptions };
-  kernelKeeper.addToRunQueue(harden(event));
+  kernelKeeper.addToAcceptanceQueue(harden(event));
Contributor


I think I'm unclear on the demarcation criteria that determine which deliveries get initially put on the acceptanceQueue and which get put on the runQueue directly. My intuition says that 'startVat' should be one of the latter, but I'm not sure what standard I should be judging on.

Member Author

@mhofman mhofman Feb 25, 2022


In the future, I believe the standard should be based on where the message comes from, and if we know the destination. For kernel initiated events, we could imagine having an "outbound queue" for the kernel with high priority. Or we could bypass that and enqueue directly onto the target.

Regardless, it's a problem I'm punting on until the next PR, as here I'm taking the stance that everything goes through the acceptance queue first, so that this PR doesn't change any delivery order.

return undefined;
}

function maybeProcessNextMessage() {
Contributor


Not sure about the name. I take the "maybe" here to mean "maybe there will be a message to process, maybe there won't, but if there is I will", rather than "maybe I'll process the next message, maybe I won't", but the name hints at the latter.

Member Author


processNextMessageIfAny?

await c.step(); // vat starts
await c.step(); // vat start acceptance
// eslint-disable-next-line no-await-in-loop
await c.step(); // vat start deliver
Contributor

@FUDCo FUDCo Feb 26, 2022


@gibson042 has a PR #4678 that's about to land that gets rid of some of this fussy c.step() stuff, which should subsume some of the changes you've had to make here.

Contributor

@FUDCo FUDCo left a comment


Now that I understand that passage through the acceptance queue is currently unconditional, this makes a lot more sense. I'm looking forward to the next step in this process.

@mhofman mhofman force-pushed the mhofman/3465-split-run-queue branch from ede05b2 to d63bce5 Compare February 26, 2022 02:37
@mhofman mhofman added the automerge:rebase Automatically rebase updates, then merge label Feb 26, 2022
@mergify mergify bot merged commit 5037442 into master Feb 26, 2022
@mergify mergify bot deleted the mhofman/3465-split-run-queue branch February 26, 2022 02:52