
rethink: figure out how to deal with unreliable changefeed #269

Closed
aeneasr opened this issue Sep 29, 2016 · 10 comments
Labels
bug (Something is not working.), upstream (Issue is caused by an upstream dependency.)

Comments

@aeneasr
Member

aeneasr commented Sep 29, 2016

This is an upstream issue: rethinkdb/rethinkdb#4133


Apparently, the RethinkDB changefeed is not reliable and drops messages without acknowledgement. The RethinkDB docs made me think that getting reliable changefeeds is possible:

Since changefeeds are unidirectional with no acknowledgement returned from clients, they cannot guarantee delivery. If you need real-time updating with delivery guarantees, consider using a model that distributes to the clients through a message broker such as RabbitMQ.

but apparently this statement does not hold, because an integration such as RabbitMQ would also rely on changefeeds.

This issue will mostly affect deployments where hydra and rethinkdb are connected over the internet (e.g. home PC <-> AWS) rather than over an intranet (e.g. same AWS region / datacenter), as packets get lost more frequently over the internet. However, it will still happen on intranets, so this needs to be figured out for production environments.


Known workarounds:

  1. Run hydra and rethinkdb on the same host
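
For reference, consuming the changefeed looks roughly like this (a minimal sketch assuming the gorethink driver and a hypothetical table name, not actual hydra code). Note that the client never sends any acknowledgement back, so a dropped connection can silently lose updates:

```go
package main

import (
	"log"

	r "gopkg.in/dancannon/gorethink.v2" // import path may differ per gorethink version
)

func watchClients(session *r.Session) error {
	// Open a plain changefeed; "hydra_clients" is a placeholder table name.
	cursor, err := r.Table("hydra_clients").Changes().Run(session)
	if err != nil {
		return err
	}
	defer cursor.Close()

	var change map[string]interface{}
	// The server pushes changes and the client only reads them; there is no
	// acknowledgement step, so anything lost on the wire is simply gone.
	for cursor.Next(&change) {
		log.Printf("change: %+v", change)
	}
	return cursor.Err()
}
```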
@theflyingcoder

At what point is it most effective to add the fallback queries? In other words, how do we know when the data in memory/cache is not up to date?

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

Good question, I was thinking about missing data, not outdated data. Maybe I need to check how to set up rethinkdb + rabbitmq to do the job reliably.

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

However, I don't really get how RabbitMQ is going to solve missing updates; it's still an app that listens to the changefeed: https://www.rethinkdb.com/docs/rabbitmq/javascript/

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

In general I don't think it's a good idea to add another external, hard-to-maintain dependency like RabbitMQ on top of rethink, especially if rethinkdb/rethinkdb#6128 holds.

Right now, I see several paths forward that would help those who want to use hydra in production:

  1. the obvious one: hope for the rethink community to help us get reliable changefeeds
  2. the "eventually consistent" one: refresh the in-memory tables every X minutes (configurable). This would also resolve oauth2/rethinkdb: clear expired access tokens from memory #228. This would be an interim solution if (1) is likely to happen in the future
  3. the last resort one: switch to a different storage backend for the stable release (rethinkdb will still be available but might not be supported officially). I have no clear idea what and how though, maybe AMQP, maybe something else.

In general, this problem will come up in a production environment, but it will be much less frequent compared to home PC <-> hosted rethinkdb, especially if both are hosted in the same region / datacenter. So maybe a combination of (2) and the original idea (fetching non-existing entries from the db) can be a good middle-ground.
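
Something like this could work for that middle-ground (a minimal sketch, assuming a hypothetical in-memory store plus fetchAll / fetchOne helpers; not actual hydra code):

```go
package cache

import (
	"sync"
	"time"
)

// Client is a placeholder for whatever hydra keeps in memory.
type Client struct{ ID string }

type MemoryStore struct {
	sync.RWMutex
	clients map[string]Client
}

// RefreshEvery implements option (2): reload the whole table every interval,
// so the in-memory copy is at most `interval` out of date.
func (s *MemoryStore) RefreshEvery(interval time.Duration, fetchAll func() (map[string]Client, error)) {
	for range time.Tick(interval) {
		if fresh, err := fetchAll(); err == nil {
			s.Lock()
			s.clients = fresh
			s.Unlock()
		}
	}
}

// Get implements the original fallback idea: if the changefeed missed an
// insert, query the database directly instead of returning "not found".
func (s *MemoryStore) Get(id string, fetchOne func(string) (Client, error)) (Client, error) {
	s.RLock()
	c, ok := s.clients[id]
	s.RUnlock()
	if ok {
		return c, nil
	}
	return fetchOne(id) // fallback query against rethinkdb
}
```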

What do you think?

aeneasr changed the title from "rethink: add fallback that queries database if values can not be found" to "rethink: figure out how to deal with unreliable changefeed" on Sep 29, 2016
@aeneasr
Member Author

aeneasr commented Sep 29, 2016

I just found this section in the rethinkdb docs, which gives me hope that changefeeds will be improved:

Warning! If the RethinkDB river plugin loses connection with the RethinkDB server it’s pulling data from, there’s no way to guarantee no documents will be lost. This should change in the future with improvements to changefeeds, but currently the only way to be sure is to backfill every time, which will still miss deleted documents.

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

This thread shows how to simulate network loss of rethinkdb nodes using iptables, which might be useful for generic rdb error tracing: https://groups.google.com/forum/#!searchin/rethinkdb/changefeed|sort:relevance/rethinkdb/7n_lBN6CKoM/iCy-LYpNIAAJ

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

The good news is, there are plans to support reliable changefeeds:

There are basically two planned degrees of reliable changefeeds:

  • Surviving short disconnects, as long as the client and servers remain up. This is the part for which we have settled an API so far. In this mode, either no change will get lost and the changefeed can just be resumed, or all changes will get lost and the changefeed will need to be restarted.
  • Surviving restarts and disconnects of both the client or server. Even permanent server failures can be sustained, as long as enough replicas are left. Picking up at a given point will be based on some sort of b-tree timestamp token that the client needs to persist (if client restarts should be survived without starting over from scratch). With this approach, a changefeed can always be resumed. However it will have "squash"-like semantics, i.e. it will omit intermediate values of any documents. It will also require an additional "delete range" notification and will sometimes emit changes for documents that weren't actually changed. There will be further restrictions, e.g. on the types of queries on which such a changefeed can be used. The goal of this mode is to keep a copy of the data synced with the current table state. A primary use case is for replicating RethinkDB data into a different secondary data store, such as ElasticSearch. For this mode, the API and exact behavior are not settled yet.

See: rethinkdb/rethinkdb#3471 (comment)

This makes option (2) a viable one.

@aeneasr
Member Author

aeneasr commented Sep 30, 2016

From a community member on Slack:

Restartable change feeds might be at 2.6. I think for 2.5 they aim to get changes() on joins. This is not official or set in stone but rough estimate

The idea I think is to know what the last received change was and replay from that onwards. So rethink will buffer these changes in the background. Tho if this is the case its still not reliable when rethink is the one going down. For that we would need persistance of some kind

[10:31] there's not definite timeframe, but I think 2.6 is due 1-2 quarter of 2017
[10:32] 2.4 should be coming out in a month or so
[10:32] 3ish months for 2.5 and another for 2.6 I think is as accurate as anyone can predict

don't take what I said as something that holds too much value. It's just "a feeling". They never promised any of that, it's just how things were a month ago when I had a convo about this with a dev

aeneasr added the bug (Something is not working.) and upstream (Issue is caused by an upstream dependency.) labels and removed the feat (New feature or request.) label on Oct 2, 2016
@aeneasr
Member Author

aeneasr commented Oct 2, 2016

There are some better options available until this is fixed upstream, in particular:

  • includeInitial
  • includeStates
  • includeTypes

See also: https://www.rethinkdb.com/api/javascript/changes/
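
For illustration, this is roughly how those options map onto the Go driver (a sketch assuming gorethink's ChangesOpts and a hypothetical table name):

```go
package main

import (
	"log"

	r "gopkg.in/dancannon/gorethink.v2" // import path may differ per gorethink version
)

func watchWithOpts(session *r.Session) error {
	cursor, err := r.Table("hydra_clients").Changes(r.ChangesOpts{
		IncludeInitial: true, // replay the current table contents before streaming changes
		IncludeStates:  true, // emit {"state": "initializing"} / {"state": "ready"} marker documents
		IncludeTypes:   true, // tag every change with a type such as "initial", "add", "change", "remove"
	}).Run(session)
	if err != nil {
		return err
	}
	defer cursor.Close()

	var change map[string]interface{}
	for cursor.Next(&change) {
		if state, ok := change["state"].(string); ok && state == "ready" {
			log.Println("initial dump done, in-memory copy is in sync with the table")
			continue
		}
		log.Printf("change (%v): %+v", change["type"], change)
	}
	return cursor.Err()
}
```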

@aeneasr
Member Author

aeneasr commented Oct 9, 2016

RethinkDB support will no longer be actively maintained unless there are customer requests; it is superseded by #292

aeneasr closed this as completed on Oct 9, 2016