SG handling for mutations deduplicated by Couchbase feed #775

adamcfraser · 2015-04-08T21:39:51Z

Both TAP and DCP include the ability to deduplicate mutations by DocID. The result is that Sync Gateway may not see a sequence for an interim mutation.

This won't result in any data loss from Sync Gateway's perspective, but it can result in delays in the Sync Gateway _changes feed, while it waits for the sequence from the deduped mutation.

In SG 1.0.4 and earlier, rapid mutations to a single doc can introduce significant latency into the _changes feed. Post-1.0.4 this is mitigated by the fix for #525, which makes missing sequences non-blocking.

However, there is still processing and performance overhead in Sync Gateway for the deduped sequences - they keep the 'low' sequence number unnecessarily low until they are eventually abandoned, and there's overhead when we eventually prune the sequence.
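To make the 'low' sequence problem concrete, here is a minimal sketch of a contiguous-sequence tracker. It is not Sync Gateway's actual cache code; `seqTracker`, `newSeqTracker` and `resolve` are hypothetical names used only for illustration, and "resolve" here covers both a sequence arriving on the feed and a sequence being abandoned.

```go
package main

// seqTracker is a hypothetical stand-in for the change cache's bookkeeping.
// low is the highest sequence such that every sequence <= low has been
// resolved (either seen on the feed or abandoned).
type seqTracker struct {
	low      uint64
	resolved map[uint64]bool // resolved sequences that are still above low
}

func newSeqTracker() *seqTracker {
	return &seqTracker{resolved: make(map[uint64]bool)}
}

// resolve marks seq as seen or abandoned, then advances low across any
// contiguous run of resolved sequences.
func (t *seqTracker) resolve(seq uint64) {
	if seq <= t.low {
		return // already accounted for
	}
	t.resolved[seq] = true
	for t.resolved[t.low+1] {
		delete(t.resolved, t.low+1)
		t.low++
	}
}
```

If sequences 1, 2, 4 and 5 arrive and 3 was deduped by the feed, `low` stays at 2 until 3 is abandoned, and the entries for 4 and 5 are held in the map in the meantime.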

A possible near-term approach would be to store the sequence number history in _sync/history alongside the revision number history, and use that to identify any sequences that didn't appear on the TAP feed. This increases the size of the _sync metadata, but the increase is relatively small compared to the other revision-level metadata we already store.
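A rough sketch of how that lookup could work, assuming the document carries a list of the sequences previously assigned to it. The `syncData` type, its `RecentSequences` field and the `dedupedSequences` function are hypothetical and only illustrate the idea; the real _sync layout may differ.

```go
package main

// syncData is a hypothetical, trimmed-down view of a document's _sync
// metadata. RecentSequences is the assumed new field holding the
// sequences assigned to earlier mutations of this document.
type syncData struct {
	Sequence        uint64   // sequence of the mutation that arrived on the feed
	RecentSequences []uint64 // earlier sequences for this doc, oldest first
}

// dedupedSequences returns the earlier sequences of doc that never
// arrived on the feed, based on the set of sequences received so far.
// These are the candidates that can be released immediately instead of
// waiting to be abandoned.
func dedupedSequences(doc syncData, received map[uint64]bool) []uint64 {
	var skipped []uint64
	for _, seq := range doc.RecentSequences {
		if seq < doc.Sequence && !received[seq] {
			skipped = append(skipped, seq)
		}
	}
	return skipped
}
```

For a doc arriving at sequence 10 with history 7, 8, 9, where only 7 was seen on the feed, this reports 8 and 9 as deduped.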

Longer term, this would overlap with the discussion about using vbucket-striped sequences for change tracking.

adamcfraser · 2015-04-10T17:38:33Z

Potentially "enhance" Walrus to reproduce this issue for unit test purposes.
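One way a test bucket might simulate the server behavior is to collapse a batch of feed events so that only the latest mutation per DocID survives. This is a sketch under that assumption, not Walrus's API; `feedEvent` and `dedupeByDocID` are hypothetical names.

```go
package main

// feedEvent is a hypothetical, minimal representation of a mutation on
// the feed.
type feedEvent struct {
	DocID    string
	Sequence uint64
}

// dedupeByDocID keeps only the last event for each DocID in the batch,
// preserving the relative order of the surviving events.
func dedupeByDocID(events []feedEvent) []feedEvent {
	// Record the index of the final event seen for each document.
	lastIndex := make(map[string]int, len(events))
	for i, ev := range events {
		lastIndex[ev.DocID] = i
	}
	out := make([]feedEvent, 0, len(lastIndex))
	for i, ev := range events {
		if lastIndex[ev.DocID] == i {
			out = append(out, ev)
		}
	}
	return out
}
```

Feeding it mutations of doc "a" at sequences 1 and 3 with doc "b" at sequence 2 in between yields only b/2 and a/3, so sequence 1 never reaches the consumer.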

adamcfraser added the in progress label Apr 10, 2015

adamcfraser self-assigned this Apr 10, 2015

zgramana added bug P3: low S4: minor labels Apr 10, 2015

zgramana added ready and removed in progress labels Apr 17, 2015

zgramana added this to the 1.1.0 milestone Apr 17, 2015

adamcfraser added review and removed ready labels Apr 23, 2015

zgramana added known-issue release-prep and removed known-issue labels Apr 27, 2015

tleyden added a commit that referenced this issue Apr 27, 2015

Merge pull request #798 from couchbase/feature/issue_775

c359969

Issue #775 - handle sequences deduplicated by server feed

tleyden pushed a commit that referenced this issue Apr 28, 2015

Add tap de-dupe to leaky bucket, make configurable

e4fc016

The tap de-dupe will be useful for creating an enhanced unit test for Issue #775. See #798

tleyden mentioned this issue Apr 28, 2015

Add tap de-duplication functionality to leaky bucket, make configurable #804

Merged

tleyden pushed a commit that referenced this issue Apr 28, 2015

Add tap de-dupe to leaky bucket, make configurable

0dfd237

The tap de-dupe will be useful for creating an enhanced unit test for Issue #775. See #798

adamcfraser mentioned this issue Apr 28, 2015

Sync Gateway crashes while processing the changes feed #809

Closed

adamcfraser closed this as completed May 1, 2015

adamcfraser removed the review label May 1, 2015
