@marceloneppel marceloneppel commented Apr 22, 2024

Issue

It's not possible to replicate data between regions.

Solution

Implement cross-region async replication. This PR is a rebranded and more stable version of #317.

With this PR, it's no longer necessary to remove the relation and relate again when a switchover is needed.

Also, the relations can easily be renamed (for example, to cluster-one and cluster-two) to avoid confusing users.

This is mostly a copy and paste from canonical/postgresql-k8s-operator#447.

Important changes:

  • src/relations/async_replication.py contains the logic to make one cluster the primary and the other the standby. To make the standby cluster follow the primary cluster, the candidate for the standby cluster needs to be restarted.

    • The most important part of the logic is handled by _on_async_relation_changed, which restarts the databases on the standby cluster's units so that they replicate data from the primary cluster.

    • In addition to everything implemented in canonical/postgresql-k8s-operator#447 ([DPE-2897] Cross-region async replication), this PR coordinates which RAFT members should be part of the Patroni internal RAFT cluster while the units in the standby cluster are starting. This is done through the return value of the get_partner_addresses method.

  • Password updates will be implemented in a separate PR, as this one is already very large.

  • If the standby cluster has its relation removed, it switches to read-only mode and can later be promoted to a normal cluster through the promote-cluster action.
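
For reviewers who want a mental model of the RAFT coordination mentioned above, here is a minimal sketch of the idea. It is not the code in src/relations/async_replication.py: the `UnitState` type, its fields, and the selection rule are all assumptions made for illustration. The only part taken from this PR is that a `get_partner_addresses`-style return value decides which members a unit lists as RAFT partners when it starts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnitState:
    """Hypothetical snapshot of what a unit knows when it starts (illustrative only)."""

    peer_addresses: List[str]    # addresses of the other units in the same cluster
    in_standby_cluster: bool     # True if this unit's cluster is the standby
    following_primary: bool      # True once the unit already replicates from the primary


def partner_addresses(state: UnitState) -> List[str]:
    """Return the RAFT partners a unit should be configured with.

    Assumed rule: a unit in the primary cluster, or a standby unit that is
    already following the primary, joins all of its peers. A standby unit
    that has not been restarted as a follower yet gets no partners, so it
    cannot form a RAFT quorum with stale members.
    """
    if not state.in_standby_cluster or state.following_primary:
        return sorted(state.peer_addresses)
    return []


# Usage: a standby unit before and after it starts following the primary.
before = UnitState(["10.0.0.3", "10.0.0.2"], in_standby_cluster=True, following_primary=False)
after = UnitState(["10.0.0.3", "10.0.0.2"], in_standby_cluster=True, following_primary=True)
print(partner_addresses(before))
print(partner_addresses(after))
```

Returning an empty list, rather than raising or returning None, keeps the caller simple: the rendered configuration just ends up with no partners for that start.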

How to deploy: https://discourse.charmhub.io/t/charmed-postgresql-deploy-async-replication/13991
How to trigger a switchover: https://discourse.charmhub.io/t/charmed-postgresql-async-switchover/13993

Additional instructions:

Integration tests: #453

Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
@marceloneppel marceloneppel marked this pull request as ready for review April 23, 2024 04:42

@taurus-forever taurus-forever left a comment

LGTM, but one critical question about patroni.yaml perms.

@marceloneppel marceloneppel merged commit 448a1a6 into main May 3, 2024
@marceloneppel marceloneppel deleted the dpe-2953-async-replication branch May 3, 2024 00:27