-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: relocate snapshot metadata from kvStore to snapStore #6781
Conversation
21617b8
to
e88d84a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good: some small API/name recommendations, and one query/index change to consider. Provisional r+ assuming those changes.
* deleteVatSnapshots: (vatID: string) => void, | ||
* deleteUnusedSnapshots: () => void, | ||
* deleteSnapshotByHash: (hash: string, vatID: string) => void, | ||
* getSnapshotInfo: (vatID: string) => SnapshotInfo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API name suggestions: to make it more clear whether a method touches all-vs-just-one vat, and/or all-vs-just-one snapshot, and to somewhat factor out the word "snapshot" (since the call sites will all be like snapstore.deleteXXX
), I'm thinking:
- (remove
has
, it doesn't make sense any more, at least not without a VatID) - deleteAllUnused (the only non-VatID-specific method)
- (I'm not positive we need an API for this: callers could just wait for the first new
save()
to trigger deletion of all the old ones, and/or they could run a manual/usr/bin/sqlite3
command if they're impatient)
- (I'm not positive we need an API for this: callers could just wait for the first new
- loadLatest
- save
- deleteAllForVat
- deleteVatSnapshotByHash
- getLatestInfo
maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has
is used in tests so it needs to stay in, though a case can be made that it should acquire a vatID parameter.
In a similar vein, deleteUnusedSnapshots
and deleteSnapshotByHash
are currently only used by tests, though one can imagine future maintenance or debugging tools that might use them. I'm very disinclined to fall back on using the sqlite command line client to do things like clear out unused snapshots, because that implies a level of understanding of the schema and a level of care using the sqlite tool that I wouldn't want to count on operators (or even myself after stepping away from this code for a few weeks) to necessarily have.
I'm a bit wishy-washy about getting rid of the word "snapshot" in the method names, given folks' propensity to use destructuring on API objects, but your comment does point out a naming inconsistency in that some method names have it and some don't -- if we want to have the word "snapshot" in the names, then for example save
should probably be saveSnapshot
. And putting "snapshot" in all of them definitely seems silly. So, yeah, we can clean that up. However, all of the non-maintenance, non-test operations (i.e., the ones that are used normally and ubiquitously) are per-vat and operate on the latest snapshot, so by the same reasoning I don't think putting "vat" or "latest" in the names helps.
It also strikes me that it's probably good to be consistent about parameter order -- the large majority of methods that take a vatID parameter should have it in the same (first) position, which is not currently universal.
So I think the main API ends up as:
load(vatID, loadRaw)
(unchanged)save(vatID, endPos, saveRaw)
(unchanged)getInfo(vatID)
(wasgetSnapshotInfo
)deleteAll(vatID)
(wasdeleteVatSnapshots
)
with testing/debug/maintenance methods:
hasHash(vatID, hash)
(washas(hash)
)deleteByHash(vatID, hash)
(wasdeleteSnapshotByHash(hash, vatID)
)deleteAllUnused()
(wasdeleteUnusedSnapshots
)
How does that all work for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After making the above changes to see what it looked/felt like in the code, I'm reconsidering the names. Part of the problem is that the API object is called "snapStore" and not "snapshotStore". Perhaps we might want to change that, but I'm leaning towards mostly keeping the old/existing names. I've found that naked getInfo
et al looks strange in the actual use context. Now I'm leaning towards:
loadSnapshot(vatID, loadRaw)
(wasload
)saveSnapshot(vatID, endPos, saveRaw)
(wassave
)getSnapshotInfo(vatID)
(now unchanged)deleteSnapshots(vatID)
(now unchanged
and then:
hasHash(vatID, hash)
(washasHash(hash)
-- keep the added vatID parameter from the above proposal)deleteSnapshotByHash(vatID, has)
(was same name but make the parameter order consistent)deleteUnusedSnapshots
(now unchanged)
Now what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, although deleteUnusedSnapshots
could be deleteAllUnusedSnapshots
to highlight the fact that it doesn't take a vatID
argument
@@ -98,16 +110,14 @@ export function makeSnapStore( | |||
) { | |||
db.exec(` | |||
CREATE TABLE IF NOT EXISTS snapshots ( | |||
vatID TEXT, | |||
endPos INTEGER, | |||
inUse INTEGER, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: used
would be easier to type when running a manual /usr/bin/sqlite3
query.. if it's easy to avoid camelCase by using just one word instead of two (and if the single word is just as readable as the pair), I usually try to do so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how hard would it be to add an uncompressedSize
column? I can see us wanting some telemetry around that value. The compressed size should be retrievable from SQL directly with length(compressedSnapshot)
, I think (haven't tested it yet), but that wouldn't help with the uncompressed size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added uncompressedSize
. I'm wondering if I should also add compressedSize
to the table while I'm at it, since it would allow this to be queried without requiring fetching the whole (possibly very large) snapshot blob.
I don't like used
because it has a past tense meaning ("was used"), whereas what we really want to indicate "is in use right now".
const sqlStopUsingLastSnapshot = db.prepare(` | ||
UPDATE snapshots | ||
SET inUse = 0 | ||
WHERE vatID = ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we're deferring indexes and general perf questions until later, but this jumps out at me as doing an O(N) operation when really there will only ever be a single row that needs updating.
If the user has configured their node to automatically delete all snapshots, the query will be O(1) (simply because there won't be any other rows to examine), but if they're archiving for whatever reason, then this query will be more expensive than it really needs to be.
Some other options would be:
- add an index on
(vatID, inUse)
, make the statementWHERE vatID = ? AND inUse = 0
, to clear the flag on only the rows where the flag was already set- or, make the index on
(inUse, vatID)
, which would accomplish the same thing, but also make a big delete-all-unused-snapshots operation cheaper (because theinUse
column appears on the MSB end of the index, and the operation would beDELETE FROM snapshots WHERE inUse = 0
)
- or, make the index on
- do a
sqlGetSnapshotInfo
-like read query first, to learn theendPos
of the latest row, and then useWHERE vatID = ? AND endPos = ?
to clear the flag on only the most recent snapshot
I think I'm more in favor of the inUse, vatID
index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding an inUse
qualifier to the WHERE
clause on the query should take care of this, i.e. change the SQL to UPDATE snapshots SET inUse = 0 WHERE inUse = 1 and vatID = ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, DELETE FROM snapshots WHERE inUse = 0
is already how the delete-all-unused operation is implemented.
In general, I think we should hold off adding explicit indexes to the database schema until data tells us we need them.
const snapshotInfo = vatKeeper.getSnapshotInfo(); | ||
if (snapshotInfo) { | ||
const { hash, endPos } = snapshotInfo; | ||
kernelSlog.write({ type: 'heap-snapshot-load', vatID, hash, endPos }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this slog entry uses hash
, while the heap-snapshot-save
one uses snapshotID
.. we need to make them consistent. I think I prefer using snapshotID
rather than hash
. You might consider making the contents of snapshotInfo
use snapshotID
instead of hash
(and avoiding variables named hash
) to make that consistency easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite prefer hash
, since really there's no longer any such thing as a "snapshot ID". The hash is not being used to identify the snapshot but is instead being used more like a checksum to validate that it consists of the bits that it's supposed to consist of. However, being consistent seems right; I think the proper thing is to excise snapshotID
from the few remaining places it appears. Having 'hash' in the slog entry says "I saved/loaded a snapshot and its hash was blah" - i.e., the hash is now just more descriptive metadata.
9ce8a36
to
f06a171
Compare
e88d84a
to
b843da9
Compare
b843da9
to
0da5f4b
Compare
Completes #6742