What is the Problem Being Solved?
#455 ("hierarchical object identifiers") is about moving voluminous state from RAM onto disk. The bottommost layer will be a new pair of syscalls to allow vats to perform synchronous reads/writes to a per-vat key-value store. This will be used by liveslots to implement the Container API (#1832).
Each vat will have exclusive access to a key-value store whose keys are strings (maybe integers) and whose values are initially strings (but will probably eventually be capdata).
The task here is to implement those two syscalls.
Description of the Design
The names are up for discussion, but I'll use syscall.readX and syscall.writeX for now. We need a name for this particular storage pool to use for the "X": we want to distinguish at least the following pools:
this one: mutable, exclusive to a single vat, indexed by short string or integer (which does not go through the c-list), contains (probably) capdata. Syscalls perform add/write/update and read. It may eventually store more structured data, and may add a list syscall with range queries or sort options to support e.g. finding the best matching offer among many
the blobstore: append-only, accessed by c-list-managed blobcaps, values are immutable large strings/bytes, shared among all vats, used as an alternative communication path for large data like vat/contract bundles, may have partial-range read functions. Operations include add and read, and maybe decref. Operations probably want to bypass the transcript (because the values are large) and use vatPowers instead of syscalls.
others?
Simply value = syscall.read(key) and syscall.write(key, value) may be good for now.
I think the vat's data should be stored in the HostDB with keys like v$NN.data.$KEY.body. This reserves room for .slots to hold the capdata later.
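A minimal sketch of the two syscalls, assuming the HostDB can be modeled as a plain Map and that a vat ID looks like v7. The names makeVatStoreSyscalls and hostDB are hypothetical, not existing kernel APIs.

```javascript
// Hypothetical sketch: the HostDB is modeled as an in-memory Map.
function makeVatStoreSyscalls(vatID, hostDB) {
  // Build the kernel-side key, following the v$NN.data.$KEY.body layout.
  const dbKey = key => `${vatID}.data.${key}.body`;
  return {
    read(key) {
      // Returns undefined when the key has never been written.
      return hostDB.get(dbKey(key));
    },
    write(key, value) {
      // Values are plain strings in this first version.
      if (typeof value !== 'string') {
        throw new TypeError('values are strings for now');
      }
      hostDB.set(dbKey(key), value);
    },
  };
}

const hostDB = new Map();
const syscall = makeVatStoreSyscalls('v7', hostDB);
syscall.write('purse1', '100');
const value = syscall.read('purse1');
```

Because the vat ID is baked into every key, two vats sharing one HostDB cannot see each other's entries in this sketch.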
Alternative Designs
secondary-storage device
If we first refined the device model (#55) to have distinct read and write calls, we might implement this ticket in terms of that model. However:
we'd still need to write the actual device
the device must record this state in the HostDB (with transaction/atomicity boundaries that match the rest of the kernel), which could be an awkward set of endowments to grant to the device
the device should be made available to many vats (I'd argue all of them), and users shouldn't have to update their bootstrap.js to distribute it to the vats which need it
expanding the functionality to include list, range queries, sort options, etc, might be accomplished by adding additional arguments to read (and changing the return value), or it might be better achieved by adding new syscalls
I'm pretty sure we need reads to observe preceding writes. #55 ("new device model: read/writeLater") suggests "writeLater", where writes aren't visible to read calls until after some outside-the-kernel "block" boundary (on the other hand, I think #55 should really be read/write too)
I'm inclined to think it would be simple to add new syscalls, but I'm interested in other opinions. I believe @dtribble has suggested that a device is the obvious choice for this sort of functionality.
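To make the visibility difference concrete, here is a rough sketch contrasting an immediate write with a "writeLater". The store shape and the endBlock name are assumptions made for illustration, not the #55 device API.

```javascript
// Hypothetical sketch contrasting immediate writes with "writeLater".
function makeStore() {
  const committed = new Map();
  const pending = new Map();
  return {
    read: key => committed.get(key),
    // write: visible to the very next read.
    write: (key, value) => {
      committed.set(key, value);
    },
    // writeLater: buffered until the block boundary.
    writeLater: (key, value) => {
      pending.set(key, value);
    },
    // Called from outside the kernel at the end of a block.
    endBlock() {
      for (const [key, value] of pending) committed.set(key, value);
      pending.clear();
    },
  };
}

const store = makeStore();
store.write('a', '1');
const seenAfterWrite = store.read('a'); // '1'
store.writeLater('b', '2');
const seenBeforeBlock = store.read('b'); // undefined
store.endBlock();
const seenAfterBlock = store.read('b'); // '2'
```

A Container API that reads back a balance it just updated would need the first behavior, which is why reads must observe preceding writes here.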
vatPowers instead of syscall
Using syscalls to fetch this voluminous data means all of it will be recorded in the transcript. #451 is about removing data from RAM, and doesn't worry about how much is left on disk (or how much we're adding to disk to minimize our RAM footprint). If we wanted to minimize disk usage too, we might have the secondary storage API go through vatPowers instead.
However, since vatPowers functions do not go through the transcript, we have no mechanism to make sure the vat sees the right values as the transcript is replayed (that's the whole point of the transcript). This could work for blobcaps, since their data is immutable, but this secondary storage is specifically for mutable data like Purse balances.
I think we're stuck with going through syscalls.
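The replay argument can be sketched as follows, assuming the transcript is simply an array of recorded syscalls. All names here (makeRecordingSyscall, makeReplaySyscall, debit) are hypothetical.

```javascript
// Hypothetical sketch: every syscall is appended to a transcript, and
// replay answers reads from the transcript instead of the live store.
function makeRecordingSyscall(store, transcript) {
  return {
    read(key) {
      const result = store.get(key);
      transcript.push({ type: 'read', key, result });
      return result;
    },
    write(key, value) {
      store.set(key, value);
      transcript.push({ type: 'write', key, value });
    },
  };
}

function makeReplaySyscall(transcript) {
  let index = 0;
  // Consume the next entry and check that the vat made the same call.
  const next = (type, key) => {
    const entry = transcript[index];
    index += 1;
    if (!entry || entry.type !== type || entry.key !== key) {
      throw new Error(`replay diverged at entry ${index - 1}`);
    }
    return entry;
  };
  return {
    read: key => next('read', key).result,
    write: (key, _value) => {
      next('write', key);
    },
  };
}

// Stand-in for vat code: debit a purse balance held in the store.
function debit(syscall, amount) {
  const balance = Number(syscall.read('balance')) - amount;
  syscall.write('balance', String(balance));
  return balance;
}

const liveStore = new Map([['balance', '100']]);
const transcript = [];
const liveResult = debit(makeRecordingSyscall(liveStore, transcript), 30);
// The store has since moved on, but replay still sees the original read.
liveStore.set('balance', '5');
const replayResult = debit(makeReplaySyscall(transcript), 30);
```

If read bypassed the transcript, the replayed debit would start from the newer balance and the vat would diverge from its original run.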
future: capdata
Our hunch is that the second use case (Zoe contracts tracking large numbers of offers/seats) will require the ability to store more than flat data in this table. We expect to store object references too (o-NN/etc), as well as the Representatives described in #455 (which means hierarchical o+NN.NN ids).
The syscalls would need to accept capdata ({ body, slots }, where body is a JSON-format string, and slots is an array of vat-side reference ids). The kernel would need to translate the slots through the vat's c-list just like it does with a syscall.send() call (for write), or dispatch.deliver() (for read). The liveslots-provided Container API's c.get() function would need to use the same marshal.unserialize as the rest of liveslots, and c.update would need to use marshal.serialize.
This marshal instance must know about the Container tables, so any Representatives are serialized properly. This will enable cross-table references (which would be "FOREIGN KEYs" in a SQL system).
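A rough sketch of the slot translation, with the c-list modeled as a pair of Maps. The ID formats (o-12, ko34), the body contents, and the function names are illustrative assumptions, not the kernel's real translators or marshal's real encoding.

```javascript
// Hypothetical sketch of translating capdata slots through a vat's c-list.
function makeCList(pairs) {
  const vatToKernel = new Map(pairs);
  const kernelToVat = new Map(pairs.map(([v, k]) => [k, v]));
  const lookup = (table, slot) => {
    if (!table.has(slot)) throw new Error(`unknown slot ${slot}`);
    return table.get(slot);
  };
  return {
    toKernel: slot => lookup(vatToKernel, slot),
    toVat: slot => lookup(kernelToVat, slot),
  };
}

// write path: vat-side slots become kernel-side slots before storage.
function translateForWrite(clist, capdata) {
  return { body: capdata.body, slots: capdata.slots.map(s => clist.toKernel(s)) };
}

// read path: kernel-side slots become vat-side slots before delivery.
function translateForRead(clist, capdata) {
  return { body: capdata.body, slots: capdata.slots.map(s => clist.toVat(s)) };
}

const clist = makeCList([
  ['o-12', 'ko34'],
  ['o+5', 'ko35'],
]);
// The body is opaque to the kernel; only the slots are rewritten.
const vatData = { body: '{"purse":"refers to slot 0"}', slots: ['o-12'] };
const stored = translateForWrite(clist, vatData);
const roundTrip = translateForRead(clist, stored);
```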
I'm not sure whether this ought to include Promises, specifically resolved ones. When these appear in a syscall.send, the vat sends a new (unresolved) promise ID, then schedules a syscall.resolve* call to happen later during the same crank. The kernel does the same thing on the inbound side when it includes an unresolved promise in the arguments of a dispatch.deliver.
The problem here is that the secondary storage API is synchronous, so any promise that it retrieves from disk won't be resolved until later. Should the kernel watch for reads that mention promises in that state and inject resolution notifications for them to the back of the run-queue? Should the read API be enhanced to return capdata plus a list of resolutions for all the promises that happen to be in that state?
I think we need to learn more about the use-case for including Promises in these offline tables before we tackle the question of how to safely implement them. I'd recommend that the first pass use translators which reject promises, and only accept object IDs in the .slots list.
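A first-pass check along these lines might look like the sketch below. It assumes vat-side object IDs start with "o" (o-NN, o+NN, or hierarchical o+NN.NN) and that anything else, including promise IDs, is rejected. The pattern and the function name are assumptions.

```javascript
// Hypothetical first-pass check: accept only object IDs in .slots.
// The pattern is illustrative and covers o-NN, o+NN and o+NN.NN.
const OBJECT_SLOT = /^o[+-]\d+(\.\d+)?$/;

function assertObjectSlotsOnly(capdata) {
  for (const slot of capdata.slots) {
    if (!OBJECT_SLOT.test(slot)) {
      throw new Error(`only object IDs may be stored, got ${slot}`);
    }
  }
  return capdata;
}

const accepted = assertObjectSlotsOnly({ body: '{}', slots: ['o-12', 'o+5.3'] });
```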
future: range queries, sort options, indices
Once #455 grows to include data like the offer book in a large exchange contract, we'll need more than a simple key-value store. At that point we'll need features like:
list the entire contents of the table
retrieve the subset of the table that meets some search criteria ("all open orders")
sort the results (e.g. by price)
limit the number of returned results
add an index to improve performance
I don't know what the API will need to look like yet.
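Purely as an illustration of the operations listed above, and not a proposal for the eventual API, a list function over an in-memory table could look like this. The option names where, sortBy and limit are made up, and the table is modeled as a Map of JSON strings.

```javascript
// Hypothetical sketch of a list operation over a per-vat table.
function list(table, { where = () => true, sortBy, limit = Infinity } = {}) {
  // Full scan: decode every row, then filter, sort and truncate.
  let rows = [...table].map(([key, json]) => ({ key, value: JSON.parse(json) }));
  rows = rows.filter(row => where(row.value));
  if (sortBy) {
    rows.sort((a, b) => sortBy(a.value) - sortBy(b.value));
  }
  return rows.slice(0, limit);
}

const offers = new Map([
  ['offer1', JSON.stringify({ status: 'open', price: 12 })],
  ['offer2', JSON.stringify({ status: 'closed', price: 9 })],
  ['offer3', JSON.stringify({ status: 'open', price: 10 })],
  ['offer4', JSON.stringify({ status: 'open', price: 15 })],
]);

// "All open orders", sorted by price, limited to two results.
const cheapestOpen = list(offers, {
  where: offer => offer.status === 'open',
  sortBy: offer => offer.price,
  limit: 2,
});
```

This version scans the whole table on every call, which is exactly the cost an index would be added to avoid.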
Security Considerations
usage limits
limit number of keys, size of any individual value, aggregate size of all values.
should the vat be killed when it exceeds the limits? or merely have the syscall return an error?
budgets should eventually be managed like Meters, when spawning a new vat the parent can share some of its space budget with the child
if we're taking user-provided keys and interpolating them into KernelDB keys, we must be careful about format-confusion attacks, especially if we add enumeration with list
when we add list, we want to continue to enforce ocap discipline
calling list should not enable code to access an object that it couldn't have seen by other means
we'll need to define a "collection" object. Holding one grants access to all the Representatives that were previously added to the collection.
we don't expect to need rights-amplification patterns that involve two collections at the same time (no intersection queries)
we don't expect to need the collection objects themselves to be serializable or sharable. Objects in collection1 might reference objects in collection2, but they won't reference the collection2 object itself.
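The usage limits and the key-format concern can be sketched together. The example below assumes the "return an error" option rather than killing the vat, models the KernelDB as a Map, and uses made-up limit values and names (LIMITS, SAFE_KEY, makeLimitedStore).

```javascript
// Hypothetical sketch: validate vat-supplied keys and enforce usage
// limits before touching the kernel DB. All limit values are illustrative.
const LIMITS = { maxKeys: 1000, maxValueLength: 10000, maxTotalLength: 100000 };
// Restricting keys to this alphabet keeps "." out, so a vat-supplied key
// cannot be confused with another component of a kernel DB key.
const SAFE_KEY = /^[A-Za-z0-9_-]{1,64}$/;

function makeLimitedStore(vatID, kernelDB, limits = LIMITS) {
  let keyCount = 0;
  let totalLength = 0;
  const dbKey = key => {
    if (typeof key !== 'string' || !SAFE_KEY.test(key)) {
      throw new Error('invalid key');
    }
    return `${vatID}.data.${key}.body`;
  };
  return {
    read: key => kernelDB.get(dbKey(key)),
    write(key, value) {
      const k = dbKey(key);
      if (typeof value !== 'string') throw new TypeError('value must be a string');
      if (value.length > limits.maxValueLength) throw new Error('value too large');
      const old = kernelDB.get(k);
      const oldLength = old === undefined ? 0 : old.length;
      const newTotal = totalLength - oldLength + value.length;
      if (newTotal > limits.maxTotalLength) throw new Error('space budget exceeded');
      if (old === undefined && keyCount + 1 > limits.maxKeys) {
        throw new Error('too many keys');
      }
      // Only commit and update the counters once every check has passed.
      kernelDB.set(k, value);
      if (old === undefined) keyCount += 1;
      totalLength = newTotal;
    },
  };
}
```

Keys are rejected rather than escaped here. An escaping scheme would also work, as long as it stays unambiguous once list is added.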