need test harness for virtual/durable object restoration #4997
Comments
As a client of the virtual object system, what I care about is verifying that objects that get paged out and back in have all the necessary state. For this purpose, it doesn't matter whether the object was actually paged out due to inactivity. Can you provide a test harness that will take a virtual object as an argument and give back another one that contains what it would hold if paged back in? Is "getting a copy of the object in the same address space to examine" an incoherent concept? |
No, no, that's a coherent concept. It's just something that liveslots tries to make impossible to observe. Let's see, the API of this would need to be something like:

```js
const c = makeScalarBigMapStore();
const root = Far('iface', {
  one: () => {
    c.init('key', makeFoo(args));
  },
  two: () => {
    const foo = c.get('key');
    testFoo();
  },
});
```

and then something to drive it that somehow guarantees that a full GC flush happens between calling `one()` and calling `two()`.

Hm, so to make sure the test is actually testing the right thing, you'd like an extra flag or query or something to ask "how many times have you made a Representative for this current+live Representative?". If that reported only 1, your test is passing without really exercising the rebuild. Or a debug-mode thing that said "give me a new Representative, I don't care if there's one out there already", but that's exactly what liveslots is trying very hard to prevent. Let's see, a pair of debug tools like:

```js
const c = makeScalarBigMapStore();
function populate() {
  c.init('key', makeFoo(args));
}
async function test() {
  populate();
  await TOOL.forceGC();
  const foo = c.get('key');
  assert(TOOL.getCreationCount(foo) > 1, 'test cannot test properly');
  testFoo();
}
```

We could add extra debug stuff to support this. @FUDCo do you have any clever ideas on how to pull this off? I'm guessing that one of the |
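For illustration, a minimal sketch of what such test-only tooling could look like — assuming a fake liveslots layer that reports each Representative it builds, plus an engine GC hook (`node --expose-gc` style); none of this is an existing API:

```js
// Hypothetical test-only tooling, not the real liveslots API.
// Assumes the (fake) virtual-object manager calls recordRepresentative()
// each time it builds a Representative for a given vref.

const vrefOf = new WeakMap(); // Representative -> vref
const creationCounts = new Map(); // vref -> how many Representatives built

export function recordRepresentative(vref, representative) {
  vrefOf.set(representative, vref);
  creationCounts.set(vref, (creationCounts.get(vref) || 0) + 1);
}

export const TOOL = {
  // Force a full GC sweep and let finalizers run, so that unreferenced
  // Representatives are actually collected (needs an engine gc() hook).
  async forceGC() {
    globalThis.gc && globalThis.gc();
    await new Promise(resolve => setImmediate(resolve));
  },
  // How many times a Representative has been built for this object.
  getCreationCount(representative) {
    const vref = vrefOf.get(representative);
    return creationCounts.get(vref) || 0;
  },
};
```

Note that `forceGC` here only triggers engine GC; a real tool would also need to evict the Representative's state from the LRU cache first, or the Representative would stay pinned in memory.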
@FUDCo Sounds like you just volunteered for this, or at least to build some tools that will obviate the need to build a full-fledged version of this. |
I had a thought, it's probably entirely subsumed by @FUDCo 's plans, but I'll mention it in case it inspires anything. Restoration (and upgrade) is about dropping an old Representative and acquiring a new Representative (possibly with different behavior). The purely-restoration flow nominally looks like:

```js
{
  const makeFoo = defineKind(behaviorVersion1..);
  const fooRepresentative1 = makeFoo(args);
  durableMap.init(key, fooRepresentative1);
}
// time passes, GC happens
{
  const fooRepresentative2 = durableMap.get(key);
  // use fooRepresentative2
}
```

and the upgrade flow looks like:

```js
{
  const makeFoo = defineDurableKind(handle, behaviorVersion1..);
  const fooRepresentative1 = makeFoo(args);
  durableMap.init(key, fooRepresentative1);
}
// time passes, GC happens, vat upgrade happens
{
  const makeFooV2 = defineDurableKind(handle, behaviorVersion2..);
  const fooRepresentative2 = durableMap.get(key);
  // use fooRepresentative2
}
```

Now technically we could allow the second flow without a whole vat upgrade:
The only time that we know there are zero Representatives, in a way that doesn't expose GC sensitivity, is immediately after upgrade, so that's currently the only time we allow Kind behavior to be "changed" (i.e. defined for the first time in a given version).

But unit tests don't need the same protections: we don't need to guard against GC sensitivity (they already have the full start compartment WeakRef/etc). And we could also allow vats to observe multiple Representatives of the same object (if they ask for it, with some special test-only function).

So we could provide a function in the test environment (never exposed to normal vats) that clears the tables which normally ensure the "zero or one Representatives at a time" rule, and then the next fetch would build a fresh Representative:

```js
{
  const makeFoo = defineKind(behaviorVersion1..);
  const fooRepresentative1 = makeFoo(args);
  durableMap.init(key, fooRepresentative1);
}
// test case flushes VOM tables
testThing.clearVirtualObjects();
{
  const fooRepresentative2 = durableMap.get(key);
  // use fooRepresentative2, definitely distinct from fooRepresentative1
}
```

We could even allow them to re-define the behavior first, and then the new rep would behave the new way:
```js
{
  const makeFoo = defineDurableKind(handle, behaviorVersion1..);
  const fooRepresentative1 = makeFoo(args);
  durableMap.init(key, fooRepresentative1);
}
// test case flushes VOM tables
testThing.clearVirtualObjects(); // also durables
{
  // we're still in the one version, now redefine the live Kind
  const makeFooV2 = defineDurableKind(handle, behaviorVersion2..);
  const fooRepresentative2 = durableMap.get(key);
  // use fooRepresentative2, has new behavior
}
```

It might be safer to rig all Representatives to self-destruct when this function is invoked, to help the authors of these tests discover mistakes, like continuing to use the old representative even though they thought they'd stopped. Maybe we make |
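One sketch of that self-destruct idea, assuming the test tool keeps a registry of outstanding Representatives (the revocable-proxy mechanism here is illustrative, not an existing liveslots feature):

```js
// Sketch: wrap each Representative in a revocable proxy, so that
// clearVirtualObjects() can make stale references throw on use.
const liveRevokers = new Set();

function wrapRepresentative(rep) {
  const { proxy, revoke } = Proxy.revocable(rep, {});
  liveRevokers.add(revoke);
  return proxy;
}

function clearVirtualObjects() {
  // ... flush the VOM's vref -> Representative tables here ...
  for (const revoke of liveRevokers) {
    revoke(); // any later use of an old Representative now throws
  }
  liveRevokers.clear();
}
```

A revoked proxy throws a TypeError on any further property access, so a test that keeps using a stale Representative fails loudly rather than silently reading old state.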
An insistence on having only one behavior present at a time for each kind seems inconsistent with best practices as I learned them at […].

Each (formal or informal) API has a client and a server end. It's often the case that the software at the two ends gets updated at different times, so one or the other has to be prepared to talk to at least two versions. (If you can't force upgrades, there may be more than two. You have to support at least two by the hypothesis that new software isn't available to both sides simultaneously.) It is often the case that the server end is easier to upgrade, so I'll write as if the higher velocity happens there.

What you end up having to do is write software that attempts to migrate in parallel. Since the two ends can't be released in parallel, the server side goes first, and adds new methods or new parameters, but continues to support all old calls. When the new client is released, it can call the new methods, or use the new calling conventions. The server should expect to be speaking to a mix of old and new clients. Once you can be sure that no old clients remain, you can release another version of the server code that drops support for the now-unused old calls.

This approach worked when the client ran in a browser but was developed by a team with a distinct release schedule, and when the client was an app that had to be released through two different app stores. This approach means that the client only needs to support one server version per release, but in practice, we found that we had version flags in the code on both sides, because scheduling software releases is an imprecise art. If server-side development of feature A misses its deadline, but feature B is ready and has been promised to someone, you really want to be able to release the client software on schedule but with the switch for B enabled and A disabled.

I suspect we can't expect to upgrade all infrastructure, contracts, and clients simultaneously, so I think we need to be prepared to live in a distributed version migration world. |
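To make that migration pattern concrete, a hypothetical server-side object that straddles two API versions might look like this (`lookupPrice`, the method names, and the quote-service framing are all invented for illustration):

```js
import { Far } from '@endo/far';

// Hypothetical server-side object supporting two API versions at once
// during a staged migration.
const makeQuoteService = lookupPrice =>
  Far('quoteService', {
    // v1 method: old clients keep calling this until they upgrade
    getPrice: brand => lookupPrice(brand).amount,
    // v2 method: shipped on the server first; new clients switch to it
    getPriceDetails: brand => {
      const { amount, timestamp } = lookupPrice(brand);
      return { amount, timestamp };
    },
    // once no v1 callers remain, a later server release can drop getPrice
  });
```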
I'm not sure I see where the inconsistency lies. Indeed the API exposed by the object must be stable enough that the consumer's expectations are not broken through an upgrade. However, that doesn't prevent the object from upgrading its implementation and having only one copy of that implementation running at any given time. I don't think building a system where a consumer interacts with a specific implementation until it itself upgrades is scalable, or even wanted, because the contract exporting objects needs the ability to fix bugs. That basically puts the burden on the object to satisfy the ancestor's API contract, possibly forever. There is also always the possibility of adding new objects, with a new API. |
* feat: first pass at VO and DVO test harnesses. Closes #4997
* docs: add documentation
* test: a virtual object test for AMM pools. see: #5207 — Pretty rudimentary. It just sees that the objects come back. The test fails when I manually remove (tested) stuff from state. There's one test that verifies changed state is persisted: I update the notifier, and verify that on revival, the count has increased. A plausible next test would be to add liquidity, but that's a bunch more work.
* chore: review-inspired cleanups.

Co-authored-by: Chip Morningstar <chip@fudco.com>
What is the Problem Being Solved?
@Chris-Hibbert pointed out that we need a good way to test that a userspace (contract) which uses virtual/durable Kinds will correctly reload those objects after getting paged out and paged back in (or after the entire vat gets upgraded).
Instances of a virtual/durable Kind are presented to JavaScript code by a "Representative", much like how a "Presence" object is the way JS interacts with a remote object in some other vat. Each Representative may or may not have its state data in RAM: there is an LRU cache which limits how many objects' data will be loaded at a time. When a method is invoked on a Representative, its state data is loaded (or moved to the front of the cache), and this might push some other object's data off the cache, which is what writes it to the DB. The dirty state records are also written to the DB when a special "bringOutYourDead" delivery is made, which happens on a configurable schedule: somewhere between once after every normal delivery, once every N deliveries, and once at the end of each block.
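To make the cache behavior concrete, here is a toy write-back LRU in the same spirit (a sketch only: entry sizing, serialization, and the real liveslots bookkeeping are all elided; `db.read`/`db.write` stand in for the vatstore):

```js
// Toy write-back LRU for virtual-object state, to illustrate when
// state actually gets flushed to the DB.
class StateCache {
  constructor(maxEntries, db) {
    this.maxEntries = maxEntries;
    this.db = db;
    // Map preserves insertion order, so the first entry is the LRU one
    this.map = new Map(); // vref -> { state, dirty }
  }
  load(vref) {
    let entry = this.map.get(vref);
    if (entry) {
      this.map.delete(vref); // re-insert to mark as most-recently-used
    } else {
      entry = { state: this.db.read(vref), dirty: false };
    }
    this.map.set(vref, entry);
    if (this.map.size > this.maxEntries) {
      // evict the least-recently-used entry, writing it back if dirty
      const [oldVref, old] = this.map.entries().next().value;
      if (old.dirty) this.db.write(oldVref, old.state);
      this.map.delete(oldVref);
    }
    return entry;
  }
  flushDirty() {
    // what a bringOutYourDead-style flush would do
    for (const [vref, entry] of this.map) {
      if (entry.dirty) {
        this.db.write(vref, entry.state);
        entry.dirty = false;
      }
    }
  }
}
```

The key point is that a DB write happens on eviction or on an explicit flush, not at the moment userspace last touches the object.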
The Representative itself is held in memory either by the userspace code that holds it, or by the LRU cache. Once it is not held by either, it will be collected the next time GC happens (either "organically" because XS needs a new `struct slot` and the free list is empty, or when `bringOutYourDead` forces a GC sweep).

By design, normal vat userspace cannot reliably force an object out of memory, nor can it sense when one has been forced out. If you drop your RAM references to the Representative, then allocate or interact with a whole bunch of other objects, you might be able to both flood the LRU and trigger organic GC. The next time you try to get a Representative (e.g. reading it from a property of some other virtual object, fetching it from a collection, or receiving it in a message from a different vat), the manager will build a new one for you. This is the point at which the Kind's registered `behavior` record is used, to create a `state` object and a new batch of methods to wrap it, which collectively make up the Representative(s).

So the low-level requirement is for some sort of test harness to allow this "rebuild a Representative" process to be exercised.
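In outline, that rebuild step might look something like the sketch below. This is not liveslots' actual code: `loadStateRecord`, the accessor-based `state` object, and the single-facet shape are all simplifying assumptions.

```js
import { Far } from '@endo/far';

// Sketch of "rebuild a Representative": combine the Kind's registered
// behavior record with a fresh `state` object backed by the DB record.
function rebuildRepresentative(vref, behavior, loadStateRecord) {
  const record = loadStateRecord(vref); // raw state properties from the DB
  const state = {};
  for (const prop of Object.keys(record)) {
    Object.defineProperty(state, prop, {
      get: () => record[prop],
      // a real implementation would also mark the cache entry dirty here
      set: value => {
        record[prop] = value;
      },
    });
  }
  // wrap each behavior method so it closes over this `state` object
  const methods = {};
  for (const [name, fn] of Object.entries(behavior)) {
    methods[name] = (...args) => fn({ state }, ...args);
  }
  return Far('Representative', methods);
}
```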
I'm not yet sure how to do this. If virtual objects work correctly, there should be no way for userspace to mess up (if it could, userspace could probably cause a consensus failure or state divergence). So in one sense, userspace authors don't need to test this; swingset authors do.
There is a related higher-level requirement here: contract upgrade means vat upgrade, and the version-1 code must be designed to store enough data in durable objects/collections to enable version-2 to fulfill its obligations to the existing clients. That sort of "restore" is very much user-visible. We need a test harness for this too.
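To make that concrete, a version-1/version-2 test under such a harness might read roughly like this. The `harness`, `baggage`, and `counterHandle` names are hypothetical, and the `defineDurableKind(handle, init, behavior)` shape here (with an assumed init-function argument) is only meant to mirror the flows sketched in the comments above:

```js
// v1 code: define a durable Kind and stash an instance durably
const initCounter = () => ({ count: 0 });
const makeCounter = defineDurableKind(counterHandle, initCounter, {
  increment: ({ state }) => (state.count += 1),
  read: ({ state }) => state.count,
});
baggage.init('counter', makeCounter());
baggage.get('counter').increment();

// hypothetical harness simulates a vat upgrade, discarding every in-RAM object
await harness.upgradeVat();

// v2 code must re-attach behavior to the same handle before use
const makeCounterV2 = defineDurableKind(counterHandle, initCounter, {
  increment: ({ state }) => (state.count += 1),
  read: ({ state }) => state.count,
  reset: ({ state }) => (state.count = 0), // new in v2
});
const counter = baggage.get('counter');
assert(counter.read() === 1); // durable state survived the upgrade
```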
Description of the Design
Security Considerations
Test Plan