Fixed handling of interval tree clocks #85
These rules have to be followed to make ITCs work properly:

- Never store a remote clock in the local registry. Remote clocks either have no identity (if peeked) or carry the wrong identity of another node, so storing them locally is always wrong. There have been many places where remote clocks were stored in the local registry.
- Use Clock.join() to receive causal information from remote (peeked) clocks. Clock.join() was not used anywhere in the Tracker, which means that causal information from other nodes was never incorporated after a broadcast.
- Always call Clock.peek() before sending clocks to other replicas. This avoids a couple of bugs where the local node stores the wrong clock identity.
- Make sure that every node gets a properly forked clock when joining the cluster.

Removed the handle_replica calls for :add_meta and :remove_meta, because clocks are stored with the metadata itself and those events cannot determine the causal ordering of add and remove operations; :update_meta is now used for all replica events. Changed broadcast_event and added a get_registry_snapshot wrapper to make sure that only peeked clocks are sent over to other nodes. The TrackerState clock is now only used to hold the clock identity.
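The fork/peek/join discipline above can be sketched as follows. This is an illustrative Python sketch that uses a simple vector clock as a stand-in for a real interval tree clock: the function names (`fork`, `peek`, `join`, `event`) mirror the ITC API, but this is not Swarm's actual implementation, and every name in the sketch is hypothetical.

```python
def seed():
    # A fresh clock: identity "a", empty event history.
    return ("a", {})

def fork(clock):
    # Split one clock into two clocks with distinct identities that
    # share the same causal history. Every node joining the cluster
    # must receive its own forked half.
    ident, events = clock
    return ((ident + "0", dict(events)), (ident + "1", dict(events)))

def peek(clock):
    # Strip the identity before sending a clock to a remote node.
    # A peeked clock carries causal information only.
    _ident, events = clock
    return (None, dict(events))

def event(clock):
    # Record a local event: only a clock that owns an identity
    # may advance its own component.
    ident, events = clock
    assert ident is not None, "anonymous (peeked) clocks cannot record events"
    events = dict(events)
    events[ident] = events.get(ident, 0) + 1
    return (ident, events)

def join(local, remote):
    # Merge causal information from a remote (peeked) clock into
    # the local clock, keeping the LOCAL identity. Storing the
    # remote clock itself in the registry would be wrong.
    ident, events = local
    _remote_ident, remote_events = remote
    merged = dict(events)
    for k, v in remote_events.items():
        merged[k] = max(merged.get(k, 0), v)
    return (ident, merged)

# Two replicas fork from a seed, diverge, then exchange peeked clocks.
left, right = fork(seed())
left = event(left)                 # a local change on "left"
right = join(right, peek(left))    # broadcast: "right" learns left's history
assert right[0] == "a1"            # join preserves the local identity
assert right[1] == {"a0": 1}      # causal information was received
```

The key point the sketch demonstrates is the asymmetry: only `peek`ed clocks cross the wire, and the receiver always folds them in with `join` rather than replacing its own clock, so no node ever adopts another node's identity.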
This change should address the issues I outlined here:
- Add tests for handling replication events: track, untrack and update_meta
- Add tests for syncing registries

Added Tracker tests to validate handling of replication events and syncing registries.
We have been using this change in production for a couple of weeks now, without any problems. It really fixes the occasional issues we saw with the clock handling.
@bitwalker have you had time to look at this change?
@tschmittni I'm trying to get all Swarm tests passing, and some previously failing tests seem to be fixed on your ITC branch, which is awesome :-) However, your most recent commit, in which you added
Thank you! It looks like the tests are not working in "clustered" mode, but you can run them with: Let me know if you want me to change anything with the tests. Otherwise, we can also just drop them if you prefer other types of tests.
Thanks, that helps. The correct branch is
The latest branch should be the one in this pull request: https://github.com/tschmittni/swarm/tree/refactored-itc-usage
I'll see what I can do to get them running with
Fixed the tests; see my PR #94.
This is a follow-up to bitwalker#85. I forgot to change the handle_topology_change method, which is, as far as I can tell, also the root cause of bitwalker#106.