
rethink: figure out how to deal with unreliable changefeed #269

Closed
aeneasr opened this issue Sep 29, 2016 · 10 comments
Labels
bug (Something is not working.), upstream (Issue is caused by an upstream dependency.)

Comments

@aeneasr
Member

aeneasr commented Sep 29, 2016

This is an upstream issue: rethinkdb/rethinkdb#4133


Apparently, the RethinkDB changefeed is not reliable and drops messages without acknowledgement. The RethinkDB docs made me think that getting reliable changefeeds is possible:

Since changefeeds are unidirectional with no acknowledgement returned from clients, they cannot guarantee delivery. If you need real-time updating with delivery guarantees, consider using a model that distributes to the clients through a message broker such as RabbitMQ.

but apparently this statement does not hold, because an integration such as RabbitMQ would also rely on changefeeds.

This issue will mostly affect deployments where hydra and rethinkdb are connected over the internet (e.g. home PC <-> AWS) rather than over an intranet (e.g. same AWS region / datacenter), as packets get lost more frequently over the internet. However, it will still happen on intranets, so this needs to be figured out for production environments.


Known workarounds:

  1. Run hydra and rethinkdb on the same host
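
For reference, consuming the changefeed looks roughly like this (a minimal sketch assuming the gorethink driver and a hypothetical table name, not actual hydra code). Note that the client never sends any acknowledgement back, so a dropped connection can silently lose updates:

```go
package main

import (
	"log"

	r "gopkg.in/dancannon/gorethink.v2" // import path may differ per gorethink version
)

func watchClients(session *r.Session) error {
	// Open a plain changefeed; "hydra_clients" is a placeholder table name.
	cursor, err := r.Table("hydra_clients").Changes().Run(session)
	if err != nil {
		return err
	}
	defer cursor.Close()

	var change map[string]interface{}
	// The server pushes changes and the client only reads them; there is no
	// acknowledgement step, so anything lost on the wire is simply gone.
	for cursor.Next(&change) {
		log.Printf("change: %+v", change)
	}
	return cursor.Err()
}
```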
@theflyingcoder

At what point is it most effective to add the fallback queries? In other words, how do we know when the data in memory/cache is not up to date?

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

Good question, I was thinking about missing data, not outdated data. Maybe I need to check how to set up rethinkdb + rabbitmq to do the job reliably.

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

However, I don't really get how RabbitMQ is going to solve missing updates; it's still an app that listens to the changefeed: https://www.rethinkdb.com/docs/rabbitmq/javascript/

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

In general I don't think it's a good idea to add another external, hard-to-maintain dependency like RabbitMQ on top of rethink, especially if rethinkdb/rethinkdb#6128 holds.

Right now, I see several paths forward that would help those who want to use hydra in production:

  1. the obvious one: hope for the rethink community to help us get reliable changefeeds
  2. the "eventually consistent" one: refresh the in-memory tables every X minutes (configurable). This would also resolve oauth2/rethinkdb: clear expired access tokens from memory #228. This would be an interim solution if (1) is likely to happen in the future
  3. the last resort one: switch to a different storage backend for the stable release (rethinkdb will still be available but might not be supported officially). I have no clear idea what and how though, maybe AMQP, maybe something else.

In general, this problem will come up in a production environment, but it will be much less frequent compared to home PC <-> hosted rethinkdb, especially if both are hosted in the same region / datacenter. So maybe a combination of (2) and the original idea (fetching non-existing entries from the db) can be a good middle-ground.
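
Something like this could work for that middle-ground (a minimal sketch, assuming a hypothetical in-memory store plus fetchAll / fetchOne helpers; not actual hydra code):

```go
package cache

import (
	"sync"
	"time"
)

// Client is a placeholder for whatever hydra keeps in memory.
type Client struct{ ID string }

type MemoryStore struct {
	sync.RWMutex
	clients map[string]Client
}

// RefreshEvery implements option (2): reload the whole table every interval,
// so the in-memory copy is at most `interval` out of date.
func (s *MemoryStore) RefreshEvery(interval time.Duration, fetchAll func() (map[string]Client, error)) {
	for range time.Tick(interval) {
		if fresh, err := fetchAll(); err == nil {
			s.Lock()
			s.clients = fresh
			s.Unlock()
		}
	}
}

// Get implements the original fallback idea: if the changefeed missed an
// insert, query the database directly instead of returning "not found".
func (s *MemoryStore) Get(id string, fetchOne func(string) (Client, error)) (Client, error) {
	s.RLock()
	c, ok := s.clients[id]
	s.RUnlock()
	if ok {
		return c, nil
	}
	return fetchOne(id) // fallback query against rethinkdb
}
```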

What do you think?

aeneasr changed the title from "rethink: add fallback that queries database if values can not be found" to "rethink: figure out how to deal with unreliable changefeed" on Sep 29, 2016
@aeneasr
Member Author

aeneasr commented Sep 29, 2016

I just found this section in the rethinkdb docs, which gives me hope that changefeeds will be improved:

Warning! If the RethinkDB river plugin loses connection with the RethinkDB server it’s pulling data from, there’s no way to guarantee no documents will be lost. This should change in the future with improvements to changefeeds, but currently the only way to be sure is to backfill every time, which will still miss deleted documents.

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

This thread shows how to simulate network loss of rethinkdb nodes using iptables, which might be useful for generic rdb error tracing: https://groups.google.com/forum/#!searchin/rethinkdb/changefeed|sort:relevance/rethinkdb/7n_lBN6CKoM/iCy-LYpNIAAJ

@aeneasr
Member Author

aeneasr commented Sep 29, 2016

The good news is, there are plans to support reliable changefeeds:

There are basically two planned degrees of reliable changefeeds:

  • Surviving short disconnects, as long as the client and servers remain up. This is the part for which we have settled an API so far. In this mode, either no change will get lost and the changefeed can just be resumed, or all changes will get lost and the changefeed will need to be restarted.
  • Surviving restarts and disconnects of both the client or server. Even permanent server failures can be sustained, as long as enough replicas are left. Picking up at a given point will be based on some sort of b-tree timestamp token that the client needs to persist (if client restarts should be survived without starting over from scratch). With this approach, a changefeed can always be resumed. However it will have "squash"-like semantics, i.e. it will omit intermediate values of any documents. It will also require an additional "delete range" notification and will sometimes emit changes for documents that weren't actually changed. There will be further restrictions, e.g. on the types of queries on which such a changefeed can be used. The goal of this mode is to keep a copy of the data synced with the current table state. A primary use case is for replicating RethinkDB data into a different secondary data store, such as ElasticSearch. For this mode, the API and exact behavior are not settled yet.

See: rethinkdb/rethinkdb#3471 (comment)

This makes option (2) a viable one.

@aeneasr
Member Author

aeneasr commented Sep 30, 2016

From a community member on Slack:

Restartable change feeds might be at 2.6. I think for 2.5 they aim to get changes() on joins. This is not official or set in stone but rough estimate

The idea I think is to know what the last received change was and replay from that onwards. So rethink will buffer these changes in the background. Tho if this is the case its still not reliable when rethink is the one going down. For that we would need persistance of some kind

[10:31] there's not definite timeframe, but I think 2.6 is due 1-2 quarter of 2017
[10:32] 2.4 should be coming out in a month or so
[10:32] 3ish months for 2.5 and another for 2.6 I think is as accurate as anyone can predict

don't take what I said as something that holds too much value. It's just "a feeling". They never promised any of that, it's just how things were a month ago when I had a convo about this with a dev

aeneasr added the bug (Something is not working.) and upstream (Issue is caused by an upstream dependency.) labels and removed the feat (New feature or request.) label on Oct 2, 2016
@aeneasr
Member Author

aeneasr commented Oct 2, 2016

There are some better options available until this is fixed upstream, in particular:

  • includeInitial
  • includeStates
  • includeTypes

See also: https://www.rethinkdb.com/api/javascript/changes/
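
For illustration, this is roughly how those options map onto the Go driver (a sketch assuming gorethink's ChangesOpts and a hypothetical table name):

```go
package main

import (
	"log"

	r "gopkg.in/dancannon/gorethink.v2" // import path may differ per gorethink version
)

func watchWithOpts(session *r.Session) error {
	cursor, err := r.Table("hydra_clients").Changes(r.ChangesOpts{
		IncludeInitial: true, // replay the current table contents before streaming changes
		IncludeStates:  true, // emit {"state": "initializing"} / {"state": "ready"} marker documents
		IncludeTypes:   true, // tag every change with a type such as "initial", "add", "change", "remove"
	}).Run(session)
	if err != nil {
		return err
	}
	defer cursor.Close()

	var change map[string]interface{}
	for cursor.Next(&change) {
		if state, ok := change["state"].(string); ok && state == "ready" {
			log.Println("initial dump done, in-memory copy is in sync with the table")
			continue
		}
		log.Printf("change (%v): %+v", change["type"], change)
	}
	return cursor.Err()
}
```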

@aeneasr
Member Author

aeneasr commented Oct 9, 2016

RethinkDB support will no longer be actively maintained unless there are customer requests; it is superseded by #292

aeneasr closed this as completed on Oct 9, 2016