Add per key actor epochs #1040

russelldb · 2014-11-11T11:04:20Z

See riak_kv#679 and the associated platform_task RFC and summary.

This PR addresses the "doomstone", backup-restore, and some byzantine flavours of the kv679 bug.

The RFC explains the mechanism in detail but briefly:

- add a vnode counter that is persisted to disk (persisted with leases, async)
- whenever a key is written for the "first time" by a vnode, create an epoch actor for the key by concatenating vnodeid+counter (and increment the counter)

This ensures that a first-time write for a key gets a new actor, which is the epoch for the key. By ensuring an actor per epoch, we don't mix up the {a, 1} event of a deleted and re-created key with the original {a, 1} event for that key, and we do so without causing a keyspace-wide actor explosion.
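As a rough illustration of the epoch actor construction described above, here is a minimal Erlang sketch. The module and function names are hypothetical, and both the 32-bit counter encoding and the increment-before-use order are assumptions made for the example, not necessarily what the code in this PR does.

```erlang
-module(epoch_actor_sketch).
-export([new_epoch_actor/2]).

%% Hypothetical helper, not a function from this PR.
%% VnodeId is the vnode's binary id; Counter is the per-vnode counter.
%% Returns the new epoch actor together with the incremented counter.
-spec new_epoch_actor(binary(), non_neg_integer()) ->
          {binary(), non_neg_integer()}.
new_epoch_actor(VnodeId, Counter)
  when is_binary(VnodeId), is_integer(Counter), Counter >= 0 ->
    NextCounter = Counter + 1,
    %% Assumption: the counter is appended as a 32-bit integer.
    EpochActor = <<VnodeId/binary, NextCounter:32/integer>>,
    {EpochActor, NextCounter}.
```

Because the counter only moves forward, two "first" writes of the same key by the same vnode end up with different actors, so their events cannot be confused.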

I hope that is enough detail for the review to start. I'll start work on EQC for the status_mgr and the new vnode functions in the meantime.

Use the counter to create a per-epoch key when the local object is not found.

When the counter hits the 32-bit integer limit, create a new vnodeid.
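The counter limit could be handled roughly as follows. This is only a sketch: the module name, the `NewIdFun` callback, the choice of the unsigned 32-bit maximum, and restarting the counter at 1 under the new vnodeid are all assumptions for illustration.

```erlang
-module(counter_limit_sketch).
-export([next_counter/3]).

%% Assumption: the limit is the largest unsigned 32-bit value (2^32 - 1).
-define(MAX_CNT, 4294967295).

%% NewIdFun is a hypothetical zero-arity fun that returns a fresh vnodeid.
-spec next_counter(binary(), non_neg_integer(), fun(() -> binary())) ->
          {binary(), non_neg_integer()}.
next_counter(VnodeId, Cnt, _NewIdFun) when Cnt < ?MAX_CNT ->
    %% Still below the limit: keep the vnodeid and bump the counter.
    {VnodeId, Cnt + 1};
next_counter(_VnodeId, _Cnt, NewIdFun) ->
    %% Limit reached: switch to a new vnodeid and restart the counter.
    {NewIdFun(), 1}.
```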

Start a new epoch if the local object has a lower epoch for the vnode id than the incoming object (no epoch counts as lower; no epoch is epoch zero)

Bad match for re-refactored put_merge return. Correct return type for highest_actor

Use the bare vnodeid as long as possible and only start a new epoch when needed. In a way the bare vnodeid is just epoch zero, so use it. Also, consider the incoming counter when deciding about new epochs: an incoming clock with the same actor+epoch but a greater counter is a hint that a byzantine failure occurred and a new epoch is needed.
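The epoch decision described in these notes can be sketched as below. The representation is hypothetical: each side is either `none` (this vnode has no entry in the clock) or an `{Epoch, Counter}` pair for this vnode, with epoch 0 standing for the bare vnodeid. The module and function names are illustrative and are not taken from the PR.

```erlang
-module(epoch_decision_sketch).
-export([needs_new_epoch/2]).

%% Hypothetical view of this vnode's entry in an object's clock.
-type entry() :: none | {Epoch :: non_neg_integer(),
                         Counter :: non_neg_integer()}.

-spec needs_new_epoch(Local :: entry(), Incoming :: entry()) -> boolean().
needs_new_epoch(Local, Incoming) ->
    {LocalEpoch, LocalCnt} = normalise(Local),
    {InEpoch, InCnt} = normalise(Incoming),
    if
        %% The local object has a lower epoch than the incoming object.
        LocalEpoch < InEpoch -> true;
        %% Same epoch but a greater incoming counter: the local vnode has
        %% apparently lost events, so treat it as a byzantine failure.
        LocalEpoch =:= InEpoch, InCnt > LocalCnt -> true;
        %% Otherwise keep using the current actor.
        true -> false
    end.

%% No entry is treated as epoch zero with a zero counter.
normalise(none) -> {0, 0};
normalise({Epoch, Cnt}) -> {Epoch, Cnt}.
```

For example, `needs_new_epoch(none, {0, 3})` is `true`: the incoming clock records three events by the bare vnodeid that the local object knows nothing about.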

NOTE: make dialyzer exits with an error, but no information. Could do with some help on that.

seancribbs · 2014-11-20T17:05:17Z

src/riak_kv_vnode.erl

+    State;
+maybe_lease_counter(State) ->
+    #state{status_mgr_pid=MgrPid, counter=CS=#counter_state{cnt=Cnt, lease=Lease, lease_size=LeaseSize}} = State,
+    %% @TODO configurable??


Yes, this should probably be configurable/tunable.

russelldb · 2014-11-24

Yeah. It is hard to decide if it should be. I'm trying to figure out a scenario where you'd want to change this. Adding it also adds more chances to shoot yourself in the foot, and more complex validation of the config item. And can it change in the vnode's lifetime? Do we need to add it to state, or check the env var every iteration?

I'm not against making it configurable, I'd just like a better understanding of the benefit vs. risk/complexity.
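To make concrete what the lease size controls, here is a minimal sketch of counter leasing in general. It is not the PR's implementation: it uses a map whose keys mirror the `cnt`, `lease` and `lease_size` fields, takes new leases synchronously (the PR does this asynchronously through the status manager), and leaves out the actual disk write. The module and function names are hypothetical.

```erlang
-module(counter_lease_sketch).
-export([new/1, increment/1, restart/2]).

%% Start with a first lease of LeaseSize values. Persisting the lease
%% value to disk is assumed to happen here and is not shown.
new(LeaseSize) when is_integer(LeaseSize), LeaseSize > 0 ->
    #{cnt => 0, lease => LeaseSize, lease_size => LeaseSize}.

%% Below the lease: increment in memory only, no disk write needed.
increment(#{cnt := Cnt, lease := Lease} = CS) when Cnt < Lease ->
    CS#{cnt := Cnt + 1};
%% Lease used up: take a new lease (persist Lease + Size), then increment.
increment(#{cnt := Cnt, lease := Lease, lease_size := Size} = CS)
  when Cnt =:= Lease ->
    CS#{cnt := Cnt + 1, lease := Lease + Size}.

%% After a crash only the persisted lease is known. Resuming from it may
%% skip some unused values but never reuses a counter value.
restart(PersistedLease, LeaseSize) ->
    #{cnt => PersistedLease,
      lease => PersistedLease + LeaseSize,
      lease_size => LeaseSize}.
```

In this model a larger lease size means fewer disk writes but more counter values skipped after a crash, which is the trade-off a tunable would expose.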

seancribbs · 2014-11-20T17:49:52Z

src/riak_kv_vnode_status_mgr.erl

+          vnode_pid :: undefined | pid()
+         }).
+
+-type status() :: [proplists:property()] | [].


The empty list is not strictly necessary in this type. Also, is there any case where you will have bare atoms in the list? If not, orddict:orddict() might be better all around.

russelldb · 2014-11-21

I don't know, it is inherited.

russelldb · 2014-11-24T17:33:24Z

I rebased against 2.0 before it moves too far away. I addressed your comments. Sadly I can't push to this branch now. Instead I opened a new PR and ref'd this one from riak_kv#1053

russelldb added 9 commits November 11, 2014 14:53

WIP move vnode persistent state into new module/process

5f3ae9e

WIP kv679 fixes. Adds a per vnode monotonic counter

55884cc

use the counter to create a per epoch key when local not found.

WIP: don't allow the counter to get above 32bit int size

478d29d

When it hits that limit create a new vnodeid.

WIP fix xref errors

01bb12d

WIP fix dialyzer

fe422fc

WIP really fix dialyzer

c9d42e1

WIP only increment vnode counter on new per key epoch

0fbbec8

Start a new epoch if the local object has a lower epoch for the vnode id than the incoming object (no epoch counts as lower; no epoch is epoch zero)

WIP fix new dialyzer errors

6d96d81

Bad match for re-refactored put_merge return. Correct return type for highest_actor

WIP update the vnode status command as per fix on 2.0 branch

dc581bd

russelldb force-pushed the bug/rdb/gh679-actor-epochs branch from 194e222 to dc581bd on November 11, 2014 15:06

russelldb added 8 commits November 11, 2014 17:24

Fix failing stats test by using correct vnode status format

412125b

Fix some doco/spec errors, fix an error getting the timeout for leases

8860971

Shhhh, loud test.

a96355f

Add some docs and specs and fix the transposed args that dialyzer found

7386fb0

NOTE: make dialyzer exits with an error, but no information. Could do with some help on that.

fix filename spec typo

f995742

Add a status command for testing purposes

5bb0022

WIP for help on test

acce03e

seancribbs reviewed Nov 20, 2014

russelldb mentioned this pull request Nov 24, 2014

per-key actor epoch (kv679) rebased on 2.0 #1053

Closed

russelldb closed this Nov 24, 2014

russelldb mentioned this pull request Dec 30, 2014

Rebase per-key actor epoch (kv679) on 2.0 [JIRA: RIAK-1331] #1070

Closed

4 tasks

seancribbs deleted the bug/rdb/gh679-actor-epochs branch April 1, 2015 23:38
