Push based replication #589

adzialocha · 2023-11-05T09:59:34Z

Only start a new replication session when we have new data for another node.

The logic behind this is also needed for client subscriptions.

adzialocha · 2024-01-09T15:37:25Z

Off the top of my head (no laptop with me right now), some rough ideas that probably need improvement:

Keep a new struct (Connections / Peers etc.) in the global Context, similar to the schema manager
Its purpose is to manage a list of all currently connected / subscribed clients and (later) nodes, probably in the form of a map from some sort of unique subscription id to a subscription enum plus a callback list. The enum is either a SchemaIdSet (collection queries and node announcements) or a DocumentId (document queries)
Since this thing now lives in the global context we can reach it from anywhere: the replication service can populate it with currently known nodes (it learns about them through peer dis-/connected messages) and so can the graphql service (whenever a subscription starts or ends)
Later (not for now) we can use this state to learn how many nodes I'm currently connected to etc. (could be a protected graphql or public crate method)
Callbacks are added to and removed from the map, scoped by this unique id and the sets of interest
A small subscription service is introduced (like all other services). It hooks into the message bus and waits for materializer events. Whenever a document is touched we want to know its schema id and document id (probably that's part of the message itself), so this needs to be sent from the materializer and received here
Like all other services, it also has access to the context and therefore to the connection table. On each incoming message it looks up whether someone is subscribed to that document or that schema and calls the callbacks accordingly
For nodes we probably then just want to send a message to the replication service, telling it to begin replication
For clients the callback probably just triggers the subscription callback, which repeats the query over the collection or document. As mentioned already, I think it's fine if filters are ignored at this stage
That's probably also not part of your PRs, but a note for later: our replication service should know if it is already in live mode with this peer. If yes, it will just send the new data to it (later just via a gossip overlay broadcast, which is also just a sort of subscription over a topic / schema id set). If not, it will initiate a normal 1:1 replication session as we already do, but it would become push-based, which is nice
Surely that subscription logic could live in another service rather than its own; not sure if it needs its own place already. On the other hand, it doesn't do anything clearly related to one service, and on top of that it might handle other things in the future, like automatically removing timed-out callbacks etc.
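
To make the idea above a bit more concrete, here is a rough sketch of what the connection table and the lookup on each materializer event could look like. Everything in it is an assumption for illustration: the names `Subscriptions`, `Interest`, `subscribe`, `unsubscribe` and `notify` are made up, and plain strings stand in for the real schema id and document id types. It ignores the message bus and the context entirely and only shows the id -> (interest + callback list) mapping.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins: plain strings instead of the real id types.
type SchemaId = String;
type DocumentId = String;
type SubscriptionId = u64;

/// What a subscriber (client or node) is interested in.
enum Interest {
    /// Collection queries and node announcements.
    Schemas(BTreeSet<SchemaId>),
    /// Document queries.
    Document(DocumentId),
}

impl Interest {
    /// Returns true if a touched document is covered by this interest.
    fn matches(&self, schema_id: &str, document_id: &str) -> bool {
        match self {
            Interest::Schemas(set) => set.contains(schema_id),
            Interest::Document(id) => id.as_str() == document_id,
        }
    }
}

/// Callback invoked with the schema id and document id of a touched document.
type Callback = Box<dyn Fn(&str, &str) + Send + Sync>;

/// The table which would live in the global context.
#[derive(Default)]
struct Subscriptions {
    next_id: SubscriptionId,
    entries: HashMap<SubscriptionId, (Interest, Vec<Callback>)>,
}

impl Subscriptions {
    /// Registers an interest with a first callback and returns its unique id.
    fn subscribe(&mut self, interest: Interest, callback: Callback) -> SubscriptionId {
        let id = self.next_id;
        self.next_id += 1;
        self.entries.insert(id, (interest, vec![callback]));
        id
    }

    /// Removes a subscription, returns false if the id was unknown.
    fn unsubscribe(&mut self, id: SubscriptionId) -> bool {
        self.entries.remove(&id).is_some()
    }

    /// Would be called by the subscription service for every materializer
    /// event. Returns the number of callbacks which were invoked.
    fn notify(&self, schema_id: &str, document_id: &str) -> usize {
        let mut called = 0;
        for (interest, callbacks) in self.entries.values() {
            if interest.matches(schema_id, document_id) {
                for callback in callbacks {
                    callback(schema_id, document_id);
                    called += 1;
                }
            }
        }
        called
    }
}
```

In this sketch a node callback would send the "begin replication" message to the replication service, while a client callback would re-run the collection or document query. The linear scan over all entries in `notify` is just the simplest thing that works; an index by schema id or document id could replace it if the table ever gets large.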

adzialocha mentioned this issue Dec 9, 2023

GraphQL subscription support #604

