Add new sync method to support #617 using txid poller to fix #656 #811

mxsasha · 2023-07-11T13:31:13Z

This is a sync mechanism specifically for authoritative deployments that need standby and query-only secondaries, particularly when mixing the two.

With #617, there is a lot of potential data in PostgreSQL that will not be included in NRTM. NRTM was always a poor solution for standby instances, with nrtm_access_list_unfiltered as a hack on top, and potential loss of journal data and any suppressed objects. PostgreSQL replication fixes a lot of these issues, including even retaining serials. However, preloaded data will go out of date on standby instances. Redis replication is not a full fix, as some of the preloaded data is in memory in worker processes.

PostgreSQL does support triggers on a replica, as noted in #656, but not when using WAL streaming. WAL streaming is the only reasonable option here, as logical streaming breaks sequences and seems to have issues with upserts. Monitoring for records where updated is later than the last check is insufficient, because it does not catch row deletions or certain suppression state changes.
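To illustrate why an updated-based check misses deletions, here is a minimal in-memory sketch. The `changed_since` helper and the sample keys are hypothetical and only stand in for a query on the real table; this is not how IRRd stores or queries objects.

```python
from datetime import datetime, timedelta, timezone


def changed_since(rows, last_check):
    """Return keys of rows whose `updated` value is later than last_check."""
    return sorted(pk for pk, updated in rows.items() if updated > last_check)


t0 = datetime(2023, 7, 11, 12, 0, tzinfo=timezone.utc)
# Hypothetical table: primary key -> `updated` timestamp.
rows = {
    "192.0.2.0/24AS65530": t0,
    "AS-EXAMPLE": t0,
}
last_check = t0 + timedelta(minutes=1)

# A modification bumps `updated`, so the check notices it.
rows["AS-EXAMPLE"] = t0 + timedelta(minutes=2)
seen_after_update = changed_since(rows, last_check)

# A deletion removes the row, so no row is left to carry a newer `updated`.
last_check = t0 + timedelta(minutes=3)
del rows["192.0.2.0/24AS65530"]
seen_after_delete = changed_since(rows, last_check)
```

The second check returns nothing even though the table changed, so a preload store driven by it would keep serving the deleted route.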

Best option: select timestamp from pg_last_committed_xact(). This does not allow us to filter for object types. However, considering route(6) and soon as-set are preloaded, many transactions will require preload store updates anyway. There will therefore be additional overhead, but with the benefit of always being sure preloaded data gets updated. Caveat: the timestamp is not database-specific, so this does not work well when running multiple databases. But this seems like an acceptable cost for people who need to run hot standby servers, and is a fairly solid fix for the long-standing difficulties in these setups.
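A rough sketch of the polling idea, assuming the caller supplies a function that runs the query and returns the timestamp. `CommitTimestampPoller`, `fetch_timestamp` and `on_change` are hypothetical names for illustration, not the code in this PR, and the database call is replaced by a stand-in so the example runs on its own.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

# Query named in the description. pg_last_committed_xact() only returns data
# when track_commit_timestamp is enabled in PostgreSQL.
LAST_COMMIT_QUERY = "SELECT timestamp FROM pg_last_committed_xact()"


class CommitTimestampPoller:
    """Hypothetical helper: calls on_change when the commit timestamp moves."""

    def __init__(
        self,
        fetch_timestamp: Callable[[], Optional[datetime]],
        on_change: Callable[[], None],
    ) -> None:
        self._fetch_timestamp = fetch_timestamp
        self._on_change = on_change
        self._last_seen: Optional[datetime] = None

    def poll(self) -> bool:
        """Check once; return True if a new commit timestamp was observed."""
        current = self._fetch_timestamp()
        if current is None or current == self._last_seen:
            return False
        self._last_seen = current
        self._on_change()
        return True


# Stand-in for the database: no commit yet, one commit, no change, new commit.
_t1 = datetime(2023, 7, 11, 13, 0, tzinfo=timezone.utc)
_t2 = datetime(2023, 7, 11, 13, 5, tzinfo=timezone.utc)
_samples = iter([None, _t1, _t1, _t2])

reloads = []
poller = CommitTimestampPoller(
    fetch_timestamp=lambda: next(_samples),
    on_change=lambda: reloads.append("reload"),
)
results = [poller.poll() for _ in range(4)]
```

In a real deployment `fetch_timestamp` would execute `LAST_COMMIT_QUERY` against the standby, and `on_change` would signal a preload store refresh. Because the timestamp covers every commit in the PostgreSQL instance, any commit triggers a refresh, which is the overhead and the multi-database caveat described above.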

The suppression status will be replicated as well, i.e. the !f queries will also work and suppressed objects are not lost during a switchover. However, the local config state shown in !J may not be consistent. This needs to be reflected in the docs. Also, during a switchover, which requires a restart anyway, PGP keys have to be reimported into the local keychain. Doing this while running as standby is hard and not worthwhile.

irrd/daemon/main.py

mxsasha added this to the IRRdv4 phase 3 milestone Jul 11, 2023

mxsasha self-assigned this Jul 11, 2023

Clean up some details in mirroring scheduler.

ce99b79

mxsasha force-pushed the txpoll branch 12 times, most recently from 3fc3a93 to f89a743 Compare July 11, 2023 15:05

try testing tx q

4cb5edc

mxsasha force-pushed the txpoll branch from f89a743 to 4cb5edc Compare July 11, 2023 15:09

mxsasha added 7 commits July 11, 2023 20:26

add signaller

02b65e2

add docs

d57a3b1

lint

f6cbdab

doc updates

35471c3

doc update

a623708

preload during ro

7e55f6b

add note about settings consistency

ef87bf6

mxsasha marked this pull request as ready for review July 12, 2023 13:49

mxsasha commented Jul 13, 2023

irrd/daemon/main.py

flatten settings

e2964dc

mxsasha force-pushed the txpoll branch from 671e5fd to e2964dc Compare July 13, 2023 09:19

mxsasha merged commit ca8f412 into main Jul 13, 2023

mxsasha deleted the txpoll branch July 13, 2023 09:47

mxsasha mentioned this pull request Jul 17, 2023

Improved separation between IRR records and authentication #617

Closed

19 tasks
