pgBackRest: Fix timeline divergence after PITR #820

SDV109 · 2024-11-27T08:38:30Z

This PR fixes a problem seen during PITR: at the end of the playbook, the replicas ended up on a different timeline than the master node.

The first workaround that fixed the timeline error was to restart Patroni on the replica nodes. This triggers pg_rewind between the replicas and the master, and the replicas then receive the WAL files they need from the master node to align their timeline. But it did not look like the right solution.

Also, while the replicas were still on timeline 1 after recovery, the master node's logs showed errors indicating that replication could not be started from the replica nodes' replication slots:

STATEMENT:  START_REPLICATION SLOT "pgnode03" 0/76000000 TIMELINE 1
ERROR:  requested starting point 0/76000000 on timeline 1 is not in this server's history
DETAIL:  This server's history forked from timeline 1 at 0/73000498.
STATEMENT:  START_REPLICATION SLOT "pgnode02" 0/76000000 TIMELINE 1
ERROR:  requested starting point 0/76000000 on timeline 1 is not in this server's history
DETAIL:  This server's history forked from timeline 1 at 0/73000498.
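
To see why the master rejects these requests, it helps to compare the two LSNs in the log. The following is a minimal Python sketch (the helper name `lsn_to_int` is illustrative and not part of the playbook). It converts the `hi/lo` hexadecimal LSN notation to integers and shows that the replicas request a position on timeline 1 that lies past the point where the master's history left that timeline:

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/76000000' to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# Values taken from the log lines above.
requested = lsn_to_int("0/76000000")   # start point requested by the replica
fork_point = lsn_to_int("0/73000498")  # where the master's history left timeline 1

# The replica's position on timeline 1 is past the fork point, so the WAL
# it asks for does not exist in the master's history.
diverged = requested > fork_point
print(diverged)  # True
```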

According to the official documentation, when target-action=shutdown is used, the recovery.signal file is not deleted. This prevents subsequent PostgreSQL starts in the cluster: the server waits for further recovery from the WAL repository, where the necessary files are missing, because a new backup has to be made after recovery. Simply deleting the recovery.signal file before starting Patroni on the replicas does not solve the missing-timeline problem.

The solution is to change the target-action for replicas from shutdown to pause. In this case the replica starts as a ready PostgreSQL instance, immediately connects to the master node, and pulls the WAL it needs for the desired timeline.
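
As a rough illustration of the change, the Python sketch below rewrites a restore command so that it uses `--target-action=pause`. The command string, the stanza name `demo`, the target timestamp, and the function name are assumptions made for this example and are not taken from the playbook; only the `--target-action=<value>` form is handled.

```python
import shlex


def with_pause_target_action(restore_command: str) -> str:
    """Return the command with any --target-action=<value> set to pause."""
    args = shlex.split(restore_command)
    rewritten = [
        "--target-action=pause" if arg.startswith("--target-action=") else arg
        for arg in args
    ]
    return shlex.join(rewritten)


# Hypothetical replica restore command, for illustration only.
old_command = (
    'pgbackrest --stanza=demo --type=time '
    '"--target=2024-11-27 08:00:00+00" --target-action=shutdown --delta restore'
)
new_command = with_pause_target_action(old_command)
print(new_command)
```

All other arguments are passed through unchanged, so the recovery target itself stays the same and only the action taken when it is reached differs.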

Changing the target-action parameter from shutdown to pause

Fix PITR

ff89cb8

vitabaks changed the title ~~Fix PITR~~ Fix timeline divergence after PITR with shutdown target action Nov 27, 2024

vitabaks changed the title ~~Fix timeline divergence after PITR with shutdown target action~~ Fix timeline divergence after PITR Nov 27, 2024

vitabaks changed the title ~~Fix timeline divergence after PITR~~ pgBackRest: Fix timeline divergence after PITR Nov 27, 2024

vitabaks mentioned this pull request Nov 27, 2024

specified neither primary_conninfo nor restore_command on restore from scratch #803

Open

vitabaks merged commit 90948bc into vitabaks:master Nov 28, 2024
15 checks passed
