This repository has been archived by the owner on Apr 14, 2022. It is now read-only.

014-safekeeper-gossip #13

Open
wants to merge 1 commit into
base: main

Conversation

kelvich
Contributor

@kelvich kelvich commented Jan 17, 2022

Safekeeper gossip

Motivation

In some situations, a safekeeper (SK) needs to coordinate with the other SKs that serve the same tenant:

  1. WAL deletion. An SK needs to know which WAL has already been safely replicated before it can delete it. Currently we keep WAL indefinitely.
  2. Deciding which SK sends WAL to the pageserver. Currently, a crash of the sending SK may lead to a livelock in which nobody sends WAL to the pageserver.
  3. Enabling direct SK-to-SK recovery without involving the compute node.

Summary

The compute node has a connection string for each safekeeper. Each time it establishes a compute->safekeeper connection, the compute node should pass all of those connection strings down to the safekeeper. With that information, safekeepers can establish Postgres connections to each other and periodically send ping messages with an LSN payload.
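
As a rough illustration of the ping exchange described above, here is a minimal sketch of the state a safekeeper might keep about its peers. The `Ping` and `PeerTable` names, the `flush_lsn` field, and the idea of taking the minimum LSN across all nodes are assumptions made for this example; the visible text does not specify the payload's exact meaning or how it feeds into WAL deletion.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ping:
    """Hypothetical ping payload; field names are assumptions."""
    sender_id: int
    flush_lsn: int  # assumed meaning: last WAL position the sender has stored


@dataclass
class PeerTable:
    """Latest LSN heard from each peer, keyed by node id."""
    peer_ids: frozenset  # peers learned from the compute's connection strings
    last_lsn: dict = field(default_factory=dict)

    def on_ping(self, ping: Ping) -> None:
        if ping.sender_id not in self.peer_ids:
            return  # ignore nodes the compute did not list
        # Keep the highest LSN seen so a delayed ping cannot move it backwards.
        prev = self.last_lsn.get(ping.sender_id, 0)
        self.last_lsn[ping.sender_id] = max(prev, ping.flush_lsn)

    def min_known_lsn(self, own_lsn: int):
        """Lowest LSN across this node and all peers.

        Returns None until every listed peer has reported at least once.
        """
        if set(self.last_lsn) != set(self.peer_ids):
            return None
        return min([own_lsn, *self.last_lsn.values()])
```

A value like `min_known_lsn` is one plausible input to the WAL deletion decision, since it bounds what every safekeeper is known to hold, but whether that alone makes deletion safe is a question for the RFC itself.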


@stepashka stepashka changed the title Safekeeper gossip description Safekeepers coordination (via gossip) Jan 18, 2022
@stepashka stepashka added c/safekeeper Component: safekeeper p/wal Pageserver: relates to WAL processing a/reliability Area: relates to reliability of the service labels Jan 18, 2022
@stepashka stepashka added the t/tech_design_rfc Type: tech design RFC label Jan 18, 2022
@kelvich
Contributor Author

kelvich commented Jan 19, 2022

I'm going to open a new one in the zenith repo.

@kelvich kelvich closed this Jan 19, 2022
@kelvich kelvich reopened this Jan 19, 2022
@kelvich kelvich changed the title Safekeepers coordination (via gossip) 014-safekeeper-gossip Jan 19, 2022
@kelvich kelvich mentioned this pull request Jan 19, 2022

## Proposed implementation

Each safekeeper can periodically ping all its peers and share connectivity and liveness info. If no ping has been received from a peer for, let's say, four ping periods, we may consider that safekeeper dead. That would mean one of the alive safekeepers should connect to the pageserver. One way to decide which one exactly: `make_connection = my_node_id == min(alive_nodes)`
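
The rule above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `last_ping_at` map of node id to last-ping timestamp, and the use of plain numeric timestamps are hypothetical. Only the four-period timeout and the `min(alive_nodes)` comparison come from the text.

```python
def alive_nodes(my_node_id, last_ping_at, now, ping_period, missed_periods=4):
    """Return the ids of nodes this safekeeper currently considers alive.

    last_ping_at maps peer node id -> time the last ping from it arrived.
    A peer is treated as dead once nothing has been heard from it for
    more than missed_periods * ping_period.
    """
    timeout = missed_periods * ping_period
    alive = {my_node_id}  # a node always counts itself as alive
    for node_id, seen_at in last_ping_at.items():
        if now - seen_at <= timeout:
            alive.add(node_id)
    return alive


def should_connect_to_pageserver(my_node_id, alive):
    # The rule from the text: the alive node with the smallest id connects.
    return my_node_id == min(alive)


# Node 2's view: node 1 has been silent for 10 periods, node 3 is fresh.
view = alive_nodes(2, {1: 0.0, 3: 9.5}, now=10.0, ping_period=1.0)
print(should_connect_to_pageserver(2, view))
```

Note that each safekeeper evaluates this against its own local view, so during a partition or around a timeout boundary two nodes may briefly both conclude that they should connect (or neither may); the receiving side would need to tolerate that.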


If a safekeeper fails, isn't that a human intervention scenario anyway? Or do we have a membership change implementation? It's a tricky thing to get right.


3 participants