Push based replication #589

Open
adzialocha opened this issue Nov 5, 2023 · 1 comment

Comments

@adzialocha
Member

Only start a new replication session when we have new data for another node.

The logic behind this is also needed for client subscriptions.

@adzialocha
Member Author

Off the top of my head (no laptop with me right now), some rough ideas that probably need improvement:

  • Keep a new struct (Connections, Peers, or similar) in the global Context, similar to the schema manager
  • Its purpose is to manage a list of all currently connected / subscribed clients and (later) nodes, probably as a map from some sort of unique subscription id to a subscription enum plus a list of callbacks. The enum is either a SchemaIdSet (collection queries and node announcements) or a DocumentId (document queries)
  • Since this lives in the global context, we can reach it from anywhere: the replication service can populate it with currently known nodes (it learns about them through peer dis-/connected messages), and the GraphQL service can update it whenever a subscription starts or ends
  • Later (not for now) we can use this state to learn things like how many nodes we are currently connected to (could be a protected GraphQL query or a public crate method)
  • Callbacks are added to and removed from the map, scoped by this unique id and its set of interests
  • A small subscription service is introduced (like all other services). It hooks into the message bus and waits for materializer events. Whenever a document gets touched, we want to know its schema id and document id. These are probably best carried in the message itself, so the materializer needs to send them and this service needs to receive them
  • Like all other services, it has access to the context and therefore to the connection table. On each incoming message it looks up whether someone is subscribed to that document or schema and calls the callbacks accordingly
  • For nodes, we probably just want to send a message to the replication service telling it to begin replication
  • For clients, the callback probably just triggers the subscription callback, which repeats the query over the collection or document. As mentioned already, I think it's fine if filters are ignored at this stage
  • Probably not part of your PRs, but a note for later: our replication service should know whether it is already in live mode with this peer. If yes, it just sends the new data to it (later via a gossip overlay broadcast, which is also just a sort of subscription over a topic / schema id set). If not, it initiates a normal 1:1 replication session as we already do, but it would become push-based, which is nice
  • This subscription logic could surely live in another service rather than its own; I'm not sure it needs its own place yet. On the other hand, it doesn't do anything clearly tied to one service, and it might handle other things in the future, like automatically removing timed-out callbacks
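
To make the ideas above a bit more concrete, here is a rough sketch of how the connection table and the lookup on materializer events could fit together. Everything in it is an assumption for illustration: the names `Connections`, `Interest`, `subscribe`, `unsubscribe` and `notify`, the use of plain strings for schema and document ids, and the callback signature are hypothetical placeholders, not existing APIs of this project. Filters are ignored, as suggested above.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins: the real id types would come from the project itself.
pub type SchemaId = String;
pub type DocumentId = String;
pub type SubscriptionId = u64;

/// What a subscriber (client or, later, node) is interested in.
pub enum Interest {
    /// Collection queries and node announcements.
    SchemaIdSet(BTreeSet<SchemaId>),
    /// Single-document queries.
    DocumentId(DocumentId),
}

impl Interest {
    /// Returns true if a touched document is relevant for this interest.
    fn matches(&self, schema_id: &str, document_id: &str) -> bool {
        match self {
            Interest::SchemaIdSet(set) => set.contains(schema_id),
            Interest::DocumentId(id) => id.as_str() == document_id,
        }
    }
}

/// Callback invoked with the schema id and document id of the touched document.
/// For a node it might message the replication service, for a client it might
/// re-run the query (both assumed, not shown here).
pub type Callback = Box<dyn FnMut(&str, &str) + Send>;

/// Table of all current subscriptions, meant to live in the global context.
#[derive(Default)]
pub struct Connections {
    next_id: SubscriptionId,
    entries: HashMap<SubscriptionId, (Interest, Vec<Callback>)>,
}

impl Connections {
    /// Registers an interest with one callback and returns its unique id.
    pub fn subscribe(&mut self, interest: Interest, callback: Callback) -> SubscriptionId {
        let id = self.next_id;
        self.next_id += 1;
        self.entries.insert(id, (interest, vec![callback]));
        id
    }

    /// Removes a subscription; returns false if the id was unknown.
    pub fn unsubscribe(&mut self, id: SubscriptionId) -> bool {
        self.entries.remove(&id).is_some()
    }

    /// Called for every materializer event. Runs the callbacks of all matching
    /// subscriptions and returns how many subscriptions matched.
    pub fn notify(&mut self, schema_id: &str, document_id: &str) -> usize {
        let mut matched = 0;
        for (interest, callbacks) in self.entries.values_mut() {
            if interest.matches(schema_id, document_id) {
                matched += 1;
                for callback in callbacks.iter_mut() {
                    callback(schema_id, document_id);
                }
            }
        }
        matched
    }
}
```

In this sketch the subscription service would call `notify` for each materializer event it receives from the message bus, while the replication and GraphQL services would call `subscribe` and `unsubscribe` as peers connect or subscriptions start and end. Sharing the table across services would additionally need some form of synchronization (for example a mutex), which is left out here.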
