Snapshot API #47

juliangruber · 2013-05-13T06:57:45Z

What about this API for snapshots

db#snapshot()

Create a new snapshot.

snapshot#get(key, value[, opts])

Read a single value from a snapshot.

snapshot#create{Read,Key,Value}Stream

Read multiple values from a snapshot.

snapshot#dispose()

Delete the snapshot.

Maybe also:

snapshot#{del,createWriteStream,...}

Just forwarded to db#*

Usage

For a consistend read stream this would be:

var snapshot() = db.snapshot();
snapshot.createReadStream()
  .pipe(...)
  .on('close', shapshot.dispose.bind(snapshot));

The text was updated successfully, but these errors were encountered:

rvagg · 2013-05-13T07:06:47Z

nice creative thinking here, I was thinking more in terms of a token to be applied to the root db for read operations but this looks quite nice.

juliangruber · 2013-05-13T07:09:13Z

This makes it easy to do multiple operations on a snapshot but for just one operations it introduces some overhead, comparing the example from above to

db.createReadStream({ snapshot : true }).pipe(...)

rvagg · 2013-05-13T07:11:40Z

Yeah, but { snapshot : true } is unnecessary, every time you create a LevelDB iterator you create a snapshot that gets discarded when you finish, so you're using snapshots even though you don't see them. Snapshots in the sense we're talking about here are for longer-lived operations, where multiple read types (get & iterator) can reference the same point-in-time.

I was thinking more:

var snapshot = db.createSnapshot()
db.createReadStream({ snapshot : snapshot }).pipe(...)
snapshot.dispose()

juliangruber · 2013-05-13T07:52:07Z

Oh, true.

What I like about the syntax I proposed is that it

extends no options objects (which should stay as small as possible)
just swaps db with snapshot and all the APIs stay the same

juliangruber · 2013-05-13T07:53:40Z

Bad about my syntax:

Plugins that overwrite methods need to do it in 2 places, or just use hooks.

But, I think we need more hooks libraries anyways, as only they should do db.get = function() {...}. With them in place there's no problem here.

rvagg · 2013-05-13T07:59:20Z

ahh, perhaps we could do both approaches - i.e. hide the fact that it uses an options object on createReadStream() and get() to present your API. That way, plugins mostly won't need to worry since they will be routed back through the standard calls:

snapshot.createReadStream() === db.createReadStream({ snapshot: snapshot })
snapshot.get('foo', cb) === db.get('foo', { snapshot: snapshot }, cb)

We wouldn't even necessarily need to document the option.

juliangruber · 2013-05-13T09:59:59Z

Nice! Let's do it like this.

Does LevelDOWN expose snapshots already?

rvagg · 2013-05-13T10:09:41Z

there's a tiny bit of snapshot code in there, not much and it's not tested.

mcollina · 2013-05-13T10:11:34Z

Are you sure we should expose the del, put, batch and createWriteStreammethods in the snapshots? The snapshot will totally ignore their changes
The following example shows the problem:

var snap = db.snapshot();
snap.put("a", "b", function() {
  snap.get("a", function(err, result) {
    // unfortunately, the key will not exists on the snapshot
  });
  db.get("a", function(err, result) {
    // but it will exists here!!
  });
});

juliangruber · 2013-05-13T10:24:28Z

Yeah, that kinda sucks.

I was thinking to store new state in memory, so snap.get('a') would return a value from memory...But, no, that's going to cause problems.

On the other hand, it could be nice to do plugin(snapshot) instead of plugin(db) and it just works...Can't think of a real use case though...

The let's just do the reading methods for now.

mcollina · 2013-05-13T10:29:37Z

If you implement it with the {snapshot: snap } option, any plugin will work by attaching itself to the main db.

juliangruber · 2013-05-13T10:52:56Z

Hm true, any plugin that will need snapshots will just have to use them. I was thinking about a plugin that would read 2 values and store 1. If it didn't have to be consistent you could call it with plugin(db) but if it had do, it would be plugin(snap).

Anyways, there's no real use case for this. So,

expose snapshots in LevelDOWN
add snapshot option to all methods that read something
add snapshot sugar with fluent api

kesla · 2013-05-13T11:07:15Z

I've been lagging a bit behind descussion about snapshots, so I'd like to know what the real use case for snapshots are to begin with? I'd say that we're in the risk of bloating levelup if we add snapshots only because we can...

dominictarr · 2013-05-13T11:26:06Z

I can't think of any uses for snapshots either, hmm, except providing a consistent view of data while you perform some computation -- but what is the point of consistent, if it's not the current data?

Anyway, on aesthetic grounds, I agree with @mcollina that snapshots shouldn't have any write methods,
also, if the db emitted a snapshot event, then plugins could listen on this, and apply themselves to the snapshot also.

But most importantly - what do we need snapshots for?

Oh, I remember one thing: I had some problems with createReadStream and then syncly putting data, and having it turn up in the read stream... i.e. a normal createReadStream should not return data that was inserted after it was created...

rvagg · 2013-05-13T11:31:19Z

ref #128 for that last problem, there's some discussion there that covers some relevant ground

mcollina · 2013-05-13T11:40:07Z

If we put the snapshosts in LevelDown, then also MemDown and Level.js should support them.
Is this feasible?

rvagg · 2013-05-13T11:45:22Z

No, I'm not going to implement it in MemDOWN (unless some clever bean submits a pull request). I don't know if IndexedDB exposes a usable API for this either.

Relevant thoughts on this in this comment: Level/levelup#129 (comment)

LevelUP is becoming more of a generic API to a storage architecture so it's important to be careful with API bloat here. As of 0.9, you'll be able to directly refer to a 'db' property of your LevelUP instance which is the underlying LevelDOWN compatible instance. I'd be happy to remove approximateSize(), repair() and destroy() from LevelUP and just say that they can be accessed from LevelDOWN which can be require()d for the static functions and referred to on 'db' for approximateSize() (mydb.db.approximateSize(...)).

Same thoughts apply here, it appears that there could be something of an emerging suspicion that the snapshot API may not be "core enough" to expose here, it would be easy enough to just expose it in LevelDOWN and it'll still be usable; we'd just have to forgo @juliangruber's nice API--but that could just as easily become a plugin.

juliangruber · 2013-05-13T12:14:47Z

Agree.

We could make it part of a extended abstractLevelDOWN test suite and each implementation could refer to that in the README.

fritzy · 2013-05-14T06:48:14Z

Use case: could a snapshot be used to read keys, create an atomic batch determined from those keys and use an optimistic lock? Say, if the snapshot is no longer valid for the read keys, fail the batch.

fritzy · 2013-05-14T06:53:05Z

Essentially, you could use snapshots to implement optimistic lock transactions.

I found an example of what I meant.
http://codeofrob.com/entries/writing-a-transaction-manager-on-top-of-leveldb.html

dominictarr · 2013-05-14T09:19:57Z

oh, right!

Yes, that would work great!

if there was a way to check if the snapshot was invalid!

you could do this with a bloom filter also, a snapshot creates a bloom filter, but also insert any puts into the bloom filter.
if there is a collision, (because you read a key, and then it was updated) then abort the transaction.

mirkokiefer · 2013-06-27T17:03:15Z

Has anyone worked on getting snapshots exposed?
I think there are definitely use-cases where you need support for atomic batch reads.

rvagg · 2013-06-27T17:06:14Z

They are partially implemented in LevelDOWN, from long ago, waiting for someone to either apply enough pressure to get it done because they have a compelling use-case or just do it themselves!

dominictarr · 2013-07-01T17:32:20Z

@mirkokiefer can you describe your usecase? we are having trouble thinking of a compelling usecase for this, so if you can give us one, that would be very helpful!

rvagg · 2013-07-01T17:47:31Z

Perhaps I should run a bountysource for this. If enough people want it and are willing to demonstrate that with $$ the I could probably crank it out. I just don't have a need for it right now and like @dominictarr said, I haven't seen any compelling use-cases that convince me that it's the next thing I should spend my limited time on.

mirkokiefer · 2013-07-01T21:34:58Z

@dominictarr I thought I had a usecase with having to iterate over a snapshot of entries while new entries are written. But readonly iterators do in fact already guarantee that they operate on a snapshot of the data - which is pretty cool :)

At the moment, this is a documentation-only PR, acting as an RFC. In particular I'd like review of my choice to use a token-based approach (as first suggested by Rod Vagg in Level/community#45) as opposed to a dedicated snapshot API surface (as suggested by Julian Gruber in Level/community#47). Main reasons for not choosing the latter: - It would be a third API surface. We have the main database API, the sublevel API which is equal to that except for implementation- specific additional methods, and would now add another API which only has read methods (get, iterator, etc, but not put and batch). If the API was reusable, meaning you could pass around a snapshot as if it was a regular database (like you can with sublevels) then I'd be cool with it. But for that to happen, we'd have to implement transactions as well, which I consider to be out of scope although transactions are in fact a use case of snapshots. - It would have a higher complexity once you factor in sublevels. I.e. to make `db.sublevel().snapshot().get(key)` read from the snapshot but also prefix the given key. By instead doing `db.sublevel().get(key, { snapshot })`, the sublevel can just forward that snapshot option to its parent database. - Furthermore, with the token approach, you can pass a snapshot to multiple sublevels, which cleanly solves the main use case of retrieving data from an index (I included an example in the docs). I renamed the existing snapshot mechanism to "implicit snapshots" and attempted to clarify the behavior of those as well. If accepted, some related issues can be closed, because this PR: - Includes (a more complete write-up than) Level/classic-level#40. - Mentions Level/leveldown#796 (which should be moved rather than closed). - Answers Level/classic-level#28. - Solves the main use case as described in Level/leveldown#486. - Effectively completes Level/awesome#19.

vweevers · 2022-09-26T07:59:59Z

Closing in favor of the approach outlined in #45. Tracked in #118.

ralphtheninja closed this as completed Dec 17, 2018

ralphtheninja reopened this Dec 18, 2018

ralphtheninja transferred this issue from Level/levelup Dec 18, 2018

vweevers added the discussion Discussion label Jan 1, 2019

vweevers added the enhancement New feature or request label Jan 1, 2019

vweevers mentioned this issue Mar 21, 2020

Implementing transaction on top of LevelUp? Level/levelup#691

Closed

vweevers mentioned this issue Jul 17, 2021

Insert Lock, Read Lock, Dead Lock Concatency Control Level/level#202

Closed

This was referenced Sep 25, 2022

Define explicit snapshots Level/abstract-level#42

Draft

Tracking issue: implicit and explicit snapshots #118

Open

vweevers closed this as completed Sep 26, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Snapshot API #47

Snapshot API #47

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

mcollina commented May 13, 2013

juliangruber commented May 13, 2013

mcollina commented May 13, 2013

juliangruber commented May 13, 2013

kesla commented May 13, 2013

dominictarr commented May 13, 2013

rvagg commented May 13, 2013

mcollina commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

fritzy commented May 14, 2013

fritzy commented May 14, 2013

dominictarr commented May 14, 2013

mirkokiefer commented Jun 27, 2013

rvagg commented Jun 27, 2013

dominictarr commented Jul 1, 2013

rvagg commented Jul 1, 2013

mirkokiefer commented Jul 1, 2013

vweevers commented Sep 26, 2022

Snapshot API #47

Snapshot API #47

Comments

juliangruber commented May 13, 2013

db#snapshot()

snapshot#get(key, value[, opts])

snapshot#create{Read,Key,Value}Stream

snapshot#dispose()

snapshot#{del,createWriteStream,...}

Usage

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

rvagg commented May 13, 2013

mcollina commented May 13, 2013

juliangruber commented May 13, 2013

mcollina commented May 13, 2013

juliangruber commented May 13, 2013

kesla commented May 13, 2013

dominictarr commented May 13, 2013

rvagg commented May 13, 2013

mcollina commented May 13, 2013

rvagg commented May 13, 2013

juliangruber commented May 13, 2013

fritzy commented May 14, 2013

fritzy commented May 14, 2013

dominictarr commented May 14, 2013

mirkokiefer commented Jun 27, 2013

rvagg commented Jun 27, 2013

dominictarr commented Jul 1, 2013

rvagg commented Jul 1, 2013

mirkokiefer commented Jul 1, 2013

vweevers commented Sep 26, 2022