Max order comparator #2

santialvarezcolombo · 2016-10-26T23:38:07Z

The new logging and caching module for Antidote will use eleveldb as one of its possible backends. To use eleveldb, we needed to write a new comparator so that keys get sorted the way we want.

Keys are composed as follows:

{antidote_key, max_value_in_vc, vc_hash, op/snap, vc}

antidote_key: the same key as in Antidote.
max_value_in_vc: the maximum value across all DCs in the VC.
vc_hash: a hash of the VC. It makes sorting cheaper, since we usually don't need to compare the whole VC to find that two keys are different. If the hashes match, we also compare the VCs, just to make sure it's not a collision.
op/snap: an atom indicating whether the stored value corresponds to an operation or a snapshot.
vc: the VC at which the op/snap occurred.
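
A minimal sketch of how such a key could be assembled, written in Python rather than the Erlang/C++ used by the PR. The function name build_key, the dict representation of the VC, the use of CRC32 as the hash, and the value 0 for an empty VC are all assumptions made for illustration; the PR does not state which hash function is used.

```python
import zlib


def build_key(antidote_key, vc, kind):
    """Build a {antidote_key, max_value_in_vc, vc_hash, op/snap, vc} tuple.

    vc is a dict mapping a DC name to its clock value (assumed shape).
    kind is "op" or "snap".
    """
    if kind not in ("op", "snap"):
        raise ValueError("kind must be 'op' or 'snap'")
    # Sort entries by DC name so that equal VCs always serialise identically.
    vc_list = sorted(vc.items())
    # Assumption: an empty VC (empty snapshot) gets a max value of 0.
    max_value = max(vc.values(), default=0)
    # Assumption: CRC32 stands in for whatever hash the real code uses.
    vc_hash = zlib.crc32(repr(vc_list).encode("utf-8"))
    return (antidote_key, max_value, vc_hash, kind, vc_list)


key = build_key("my_key", {"dc2": 7, "dc1": 3}, "op")
print(key[1])  # max value in the VC
print(key[4])  # VC entries sorted by DC name
```

Sorting the VC entries before hashing means that two VCs with the same contents always produce the same hash, regardless of the order in which the entries were supplied.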

Sorting of keys

Keys get sorted first by the Antidote key.
Then we look at the MAX value of its VC, sorting the biggest one first. We reverse the "natural" order, so that the most recent operations come first.
If the max values of two keys match, we sort by the hash value.
For matching hashes, we compare the rest of the key and sort accordingly.
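
The ordering rules above can be sketched as a three-way comparison. This is an illustration in Python, not the C++ comparator from the PR: the name compare_keys is hypothetical, keys are assumed to be already-decoded tuples, and the direction used for the hash and for the remaining fields (ascending) is an assumption, since the description only fixes the direction for the max value.

```python
from functools import cmp_to_key


def _cmp(a, b):
    """Return -1, 0 or 1, like a C-style comparator."""
    return (a > b) - (a < b)


def compare_keys(k1, k2):
    key1, max1, hash1, kind1, vc1 = k1
    key2, max2, hash2, kind2, vc2 = k2
    # 1. Antidote key, in natural order.
    c = _cmp(key1, key2)
    if c != 0:
        return c
    # 2. Max value in the VC, reversed: the biggest (most recent) first.
    c = _cmp(max2, max1)
    if c != 0:
        return c
    # 3. Hash of the VC (ascending order is an assumption).
    c = _cmp(hash1, hash2)
    if c != 0:
        return c
    # 4. Same hash: compare the rest of the key to rule out a collision.
    return _cmp((kind1, vc1), (kind2, vc2))


keys = [
    ("x", 5, 111, "op", [("dc1", 5)]),
    ("x", 9, 222, "snap", [("dc1", 9)]),
    ("a", 1, 333, "op", [("dc1", 1)]),
]
for k in sorted(keys, key=cmp_to_key(compare_keys)):
    print(k)
```

With this ordering, all entries for one Antidote key are contiguous, and within that group the entry with the largest max value is reached first.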

Erlang serialises the objects sent to eleveldb according to the external term format (http://erlang.org/doc/apps/erts/erl_ext_dist.html), so what the comparator does is check the type codes defined there in order to parse and compare keys.
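
To illustrate what checking those type codes involves, here is a minimal Python sketch that decodes a small subset of the external term format: the version byte 131, SMALL_INTEGER_EXT (97), INTEGER_EXT (98), ATOM_EXT (100) and SMALL_TUPLE_EXT (104). It is not the PR's C++ parser; the function names decode and decode_term are hypothetical, and lists, binaries and big numbers are left out.

```python
import struct

VERSION_MAGIC = 131
SMALL_INTEGER_EXT = 97
INTEGER_EXT = 98
ATOM_EXT = 100
SMALL_TUPLE_EXT = 104


def decode_term(buf, pos=0):
    """Decode one term starting at pos; return (value, next_pos)."""
    tag = buf[pos]
    pos += 1
    if tag == SMALL_INTEGER_EXT:  # one unsigned byte
        return buf[pos], pos + 1
    if tag == INTEGER_EXT:  # 32-bit signed, big-endian
        (value,) = struct.unpack_from(">i", buf, pos)
        return value, pos + 4
    if tag == ATOM_EXT:  # 2-byte length, then the atom name
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        return buf[pos:pos + length].decode("latin-1"), pos + length
    if tag == SMALL_TUPLE_EXT:  # 1-byte arity, then the elements
        arity = buf[pos]
        pos += 1
        items = []
        for _ in range(arity):
            item, pos = decode_term(buf, pos)
            items.append(item)
        return tuple(items), pos
    raise ValueError("unsupported tag %d" % tag)


def decode(buf):
    """Decode a full term, checking the leading version byte."""
    if not buf or buf[0] != VERSION_MAGIC:
        raise ValueError("missing version byte 131")
    value, _ = decode_term(buf, 1)
    return value


# The tuple {op, 300}: 131, tuple of arity 2, atom "op", integer 300.
encoded = bytes([131, 104, 2, 100, 0, 2]) + b"op" + bytes([98, 0, 0, 1, 44])
print(decode(encoded))
```

A comparator working on serialised keys has to walk the two byte strings in this way, because the raw bytes include tags and length prefixes that do not sort in the order of the values they describe.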

peterzeller · 2016-10-27T09:06:33Z

I wonder whether you still need the custom comparator if you no longer have to compare vector clocks.

I think you could now use Russell's approach (he explains it in a talk, at 33:40) and just format the key such that the native ordering works. He used 0-bytes in the key to separate the different parts. For comparing numbers you would probably have to pad them with zeros to make them comparable under the native ordering.
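
A minimal sketch of that kind of encoding, assuming the default byte-wise ordering of LevelDB. The function name encode_key, the field widths (20 decimal digits for a 64-bit clock value, 10 for a 32-bit hash), and the trick of storing MAX minus the value to get the most recent entries first are illustrative assumptions, not something taken from the talk or from the PR.

```python
MAX_U64 = 2**64 - 1  # assumed upper bound for clock values
MAX_U32 = 2**32 - 1  # assumed upper bound for the VC hash
SEP = b"\x00"        # 0-byte separating the parts of the key


def encode_key(antidote_key, max_in_vc, vc_hash, kind):
    """Encode a key so that plain byte-wise comparison sorts it as wanted.

    antidote_key and kind are bytes; max_in_vc and vc_hash are integers.
    """
    if SEP in antidote_key:
        raise ValueError("antidote_key must not contain a 0-byte")
    if not 0 <= max_in_vc <= MAX_U64:
        raise ValueError("max_in_vc out of range")
    if not 0 <= vc_hash <= MAX_U32:
        raise ValueError("vc_hash out of range")
    # Zero-padding makes numbers compare correctly as text; inverting the
    # value makes bigger (more recent) clock values sort first.
    inverted = ("%020d" % (MAX_U64 - max_in_vc)).encode("ascii")
    hashed = ("%010d" % vc_hash).encode("ascii")
    return SEP.join([antidote_key, inverted, hashed, kind])


newer = encode_key(b"k", 10, 1, b"op")
older = encode_key(b"k", 9, 1, b"op")
print(newer < older)  # the entry with the bigger max value sorts first
```

Because the 0-byte is smaller than any other byte, a key such as "a" still sorts before "ab" once the separator is appended, which keeps all entries for one Antidote key contiguous.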

I don't know if that would be faster, but I think it would be easier to maintain if we later want to update eleveldb or if we want to add big-sets to Antidote.

Another comment: Wouldn't it make sense to put the op/snap part more to the front, so that snapshots and operations are stored separately?

santialvarezcolombo · 2016-10-27T12:35:23Z

We need the comparator because it makes lookups by partial key work. What I mean by partial keys is this: if you simply reverse the order, a fold that starts from a partial key doesn't match any starting key and therefore folds over the whole DB.

I was planning to first get this version working in Antidote, and then follow Russell's suggestion and get rid of the C++ code, which I would really like too.

I can try out the op/snap thing once this code is working on Antidote and check if there is any difference.

marc-shapiro · 2016-10-27T15:37:40Z

Santiago,

Repeating the suggestion by Russell: instead of coding a new comparator (which implies complexity and overhead), couldn't you just encode the LevelDB key as you mention below, so that the default comparison operator returns the right thing (or close enough)?

Marc

File Changes

A c_src/antidote.cc https://github.com/SyncFree/eleveldb/pull/2/files#diff-0 (218)
A c_src/antidote.h https://github.com/SyncFree/eleveldb/pull/2/files#diff-1 (11)
M c_src/eleveldb.cc https://github.com/SyncFree/eleveldb/pull/2/files#diff-2 (11)
M src/eleveldb.erl https://github.com/SyncFree/eleveldb/pull/2/files#diff-3 (1)

santialvarezcolombo · 2016-10-27T23:13:51Z

@marc-shapiro I'll talk to Russell again so we can get rid of this code faster...


santialvarezcolombo added 28 commits January 13, 2016 14:26

de5131f first changes to add antidote comparator
64a3474 fix elseif order
466f4b1 added erlang external format checks + comparison of antidote keys
c5da7d0 fix bug in Slice size while copying
8a9b7dc parse list size method
56aa2ff key compare finished. started with VC comparison
e20ac19 first version for VC comparison
dc7acf4 revert change in checkList method
a6b9581 added missing tuple parsing
53df812 revrse sorting order. most recent VCs first
43c962c added code to provide folding
415e524 refactored VCs comparator to treat keys with different amount of VCs
7cde4b3 refactored VCs comparator to treat keys with different amount of VCs
89ca787 fixed bug while parsing ints
e466ab0 VCs keys are now atoms instead of ints
a733095 Parsing for empty lists (empty snapshots)
ac4942b fix bug while comparing keys with != number of DCs
59d1133 first implementation to use big numbers
3b62c91 fixed assert
539552e fixed power
8b54842 New comparator method, asuming VCs is a list sorted by DCs
4669c3b Merge branch 'antidote_comparator' of https://github.com/SyncFree/eleveldb into antidote_comparator
ad1c770 Fixed comparison of keys with same VC to deferentiate op vs snaps.
acfd2bd New comparator ignoring VCs name and only taking into consideration the values of each one from max to min
e6d633e New comparator function using only the MAX value in the VC

santialvarezcolombo mentioned this pull request Oct 26, 2016

Antidote comparator #1

Closed

santialvarezcolombo assigned bieniusa Oct 26, 2016

santialvarezcolombo assigned aletomsic Oct 26, 2016
