Skip to content

Conversation

dragomirp
Copy link
Contributor

@dragomirp dragomirp commented Nov 14, 2024

Set all replicas to sync

Copy link

codecov bot commented Nov 14, 2024

Codecov Report

Attention: Patch coverage is 65.00000% with 7 lines in your changes missing coverage. Please review.

Project coverage is 71.96%. Comparing base (7e79b64) to head (acb98f8).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
src/cluster.py 55.55% 3 Missing and 1 partial ⚠️
src/charm.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #672      +/-   ##
==========================================
- Coverage   72.02%   71.96%   -0.07%     
==========================================
  Files          15       15              
  Lines        3482     3474       -8     
  Branches      535      532       -3     
==========================================
- Hits         2508     2500       -8     
- Misses        844      845       +1     
+ Partials      130      129       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dragomirp dragomirp marked this pull request as ready for review November 15, 2024 20:40
src/cluster.py Outdated
r = requests.patch(
f"{self._patroni_url}/config",
json={"synchronous_node_count": units // 2},
json={"synchronous_node_count": units - 1},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REST calls should also use the new value.

@dragomirp dragomirp requested review from a team, taurus-forever, marceloneppel and lucasgameiroborges and removed request for a team November 15, 2024 21:09
Copy link

@lucasgameiroborges lucasgameiroborges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, but why are we setting all nodes to be sync replicas? From the mentioned ticket it should come from a config value?

@dragomirp
Copy link
Contributor Author

Code LGTM, but why are we setting all nodes to be sync replicas? From the mentioned ticket it should come from a config value?

What was discussed was to switch the default behaviour to all sync and add a config for controlling the amount of syncs. The config needs a spec. I'll comment on the ticket to clarify.

Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Testing it, it's working fine in almost all scenarios.

The scenario where it's not working yet is the upgrade:

juju deploy postgresql --channel 14/stable -n 3

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, one sync-standby and one replica, as expected.

juju run postgresql/leader pre-upgrade-check

juju refresh --path ./*.charm postgresql

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, one sync-standby and one replica, but we should have one leader and two sync-standby only.

juju add-unit postgresql -n 1

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, and three sync-standby, as expected.

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, need some time to play it without rush.
Let's merge it after release the current PostgreSQL VM to stable.

@dragomirp
Copy link
Contributor Author

The scenario where it's not working yet is the upgrade:

5082317 should fix the upgrade. Not adding an assertion to the integration tests because functionality will drift when the changes land on edge and stable.

Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scenario where it's not working yet is the upgrade:

5082317 should fix the upgrade. Not adding an assertion to the integration tests because functionality will drift when the changes land on edge and stable.

Thanks, Dragomir! LGTM!

@dragomirp dragomirp merged commit 87f0755 into main Feb 13, 2025
93 checks passed
@dragomirp dragomirp deleted the dpe-5827-all-sync branch February 13, 2025 14:41
dragomirp added a commit that referenced this pull request Mar 13, 2025
* Use `charmcraft test` & concierge (#762)

* Update charmcraft.yaml build tools (#760)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [DPE-6020] Better promote-to-primary unit scope error handling (#759)

* Bump libs

* Flip default scope

* Better action failure

* Wrong attr

* Revert scope

* Bump libs

* Handle async replica switchover

* Unit tests

* Bump cosl

* Disable Nextcloud test (#767)

* Update canonical/data-platform-workflows action to v30 (#770)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Use _promote_charm.yaml (#771)

Use `charmcraft promote` and auto-generate release notes

* [DPE-5827] Set all nodes to synchronous replicas (#672)

* Set all nodes to synchronous replicas

* Fix template var

* Also change config patching

* Update sync nodes during upgrade

* Revert are_writes_increasing changes

* Add back logging

* Try without logs

* Tactical sleep

* Log removal error

* Remove logs

* Tweak replication test

* Pass down unit

* Wait for test app to idle

* Add comment

* Port config changes

* Copy policy test

* Fix import

* Missed param removal

* Unit test

* Missing attr

* Add logs

* Add timeout to connection

* Log conn str

* Fix num of standbys

* Charm fixture

* Remove stepdown hook

* Config description

* Revert conn str

* Add async scaling test

* Typo

* Don't remove standby and primary

* Update dependency psutil to v7 (#772)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency cryptography to v44.0.1 [SECURITY] (#764)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.0.2 (#765)

* Update canonical/data-platform-workflows action to v30.0.2

* Update promote.yaml

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

* [DPE-6323] Handle missing stanza output (#727)

* Handle missing stanza output

* Update libs

* Unit tests

* Update canonical/has-signed-canonical-cla action to v2 (#773)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [MISC] Define charm constants (#774)

* Lock file maintenance Python dependencies (#743)

* Lock file maintenance Python dependencies

* Backoff boto3 1.36

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Dragomir Penev <dragomir.penev@canonical.com>

* Update charmcraft.yaml build tools (#768)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.1.3 (#776)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency uv to v0.6.3 (#780)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [MISC] Sanitize PostgreSQL extra-user-roles arg (#782)

* [MISC] Fix PostgreSQL lib function signature (#786)

* [MISC] Skip backup and subordinate tests without creds (#789)

* Bump libs

* Skip backup tests without creds

* Skip subordinate tests

* Update tests/integration/test_subordinates.py

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

---------

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

* Update dependency jinja2 to v3.1.6 [SECURITY] (#788)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Disable cache

* Reduce required approvals on Renovate pull requests by 1 (#787)

* Sync docs from Discourse (#748)

Co-authored-by: GitHub Actions <41898282+github-actions[bot]@users.noreply.github.com>

* Cleanup juju 2 tests

* Linting

* Integration test diffs

* Try with series for ubuntu pro subordinate

* Filter terminated units

* Bump PG version

* Disable pgaudit for timescale and postgis

* Linting

* Remove tests

* Remove param for secrets

* Linting

* Idle when disabling pgaudit

* Actually disable audit

* Disable timescale in object test

* Try to disable plugins between tests

* Update canonical/data-platform-workflows action to v30.2.0 (#792)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Try to disable pgaudit in general

---------

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Sinclert Pérez <sinclert.perez@canonical.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
dragomirp added a commit that referenced this pull request Mar 19, 2025
* WIP: enable pg_tracing

* adapt render patroni function

* Move pg_tracing conf to the correct location

* fix lint and unit tests

* use ubuntu 24.04 as base

* fix issues + revert base to jammy

* fix queries for plugin testing

* downgrade psycopg2 version

* update lock file

* update psycopg2-binary too

* revery psycopg2 versions and add ssl params

* revert lock

* use noble as base

* use different workflow version

* remove cache

* edit plugin

* fix integration tests

* revert pgtracing config + add new snaps

* try fixes

* try new fixes

* try fixing restart

* try catching reload errors

* adapt test_subordinates to remove ubuntu pro charm

* try use newer branch for workflow

* fix lock hash

* use new branch for plugin too

* remove old refs to pg 14

* specify cc version 3 on release workflow

* remove juju 2.9 + refactor release workflow

* remove libjuju constraint

* fix release too

* use new charm + small adjustments

* fix lock file

* fix build_charm issue

* remove base from deploy calls

* nits

* Use `charmcraft test` & concierge (#762)

* Update charmcraft.yaml build tools (#760)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [DPE-6020] Better promote-to-primary unit scope error handling (#759)

* Bump libs

* Flip default scope

* Better action failure

* Wrong attr

* Revert scope

* Bump libs

* Handle async replica switchover

* Unit tests

* Bump cosl

* Disable Nextcloud test (#767)

* Update canonical/data-platform-workflows action to v30 (#770)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Use _promote_charm.yaml (#771)

Use `charmcraft promote` and auto-generate release notes

* [DPE-5827] Set all nodes to synchronous replicas (#672)

* Set all nodes to synchronous replicas

* Fix template var

* Also change config patching

* Update sync nodes during upgrade

* Revert are_writes_increasing changes

* Add back logging

* Try without logs

* Tactical sleep

* Log removal error

* Remove logs

* Tweak replication test

* Pass down unit

* Wait for test app to idle

* Add comment

* Port config changes

* Copy policy test

* Fix import

* Missed param removal

* Unit test

* Missing attr

* Add logs

* Add timeout to connection

* Log conn str

* Fix num of standbys

* Charm fixture

* Remove stepdown hook

* Config description

* Revert conn str

* Add async scaling test

* Typo

* Don't remove standby and primary

* Update dependency psutil to v7 (#772)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency cryptography to v44.0.1 [SECURITY] (#764)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.0.2 (#765)

* Update canonical/data-platform-workflows action to v30.0.2

* Update promote.yaml

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

* [DPE-6323] Handle missing stanza output (#727)

* Handle missing stanza output

* Update libs

* Unit tests

* Update canonical/has-signed-canonical-cla action to v2 (#773)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [MISC] Define charm constants (#774)

* Lock file maintenance Python dependencies (#743)

* Lock file maintenance Python dependencies

* Backoff boto3 1.36

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Dragomir Penev <dragomir.penev@canonical.com>

* Update charmcraft.yaml build tools (#768)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.1.3 (#776)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency uv to v0.6.3 (#780)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [MISC] Sanitize PostgreSQL extra-user-roles arg (#782)

* [MISC] Fix PostgreSQL lib function signature (#786)

* [MISC] Skip backup and subordinate tests without creds (#789)

* Bump libs

* Skip backup tests without creds

* Skip subordinate tests

* Update tests/integration/test_subordinates.py

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

---------

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

* Update dependency jinja2 to v3.1.6 [SECURITY] (#788)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Reduce required approvals on Renovate pull requests by 1 (#787)

* Sync docs from Discourse (#748)

Co-authored-by: GitHub Actions <41898282+github-actions[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.2.0 (#792)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency uv to v0.6.5 (#785)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Pg 16 sync main (#793)

* Use `charmcraft test` & concierge (#762)

* Update charmcraft.yaml build tools (#760)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [DPE-6020] Better promote-to-primary unit scope error handling (#759)

* Bump libs

* Flip default scope

* Better action failure

* Wrong attr

* Revert scope

* Bump libs

* Handle async replica switchover

* Unit tests

* Bump cosl

* Disable Nextcloud test (#767)

* Update canonical/data-platform-workflows action to v30 (#770)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Use _promote_charm.yaml (#771)

Use `charmcraft promote` and auto-generate release notes

* [DPE-5827] Set all nodes to synchronous replicas (#672)

* Set all nodes to synchronous replicas

* Fix template var

* Also change config patching

* Update sync nodes during upgrade

* Revert are_writes_increasing changes

* Add back logging

* Try without logs

* Tactical sleep

* Log removal error

* Remove logs

* Tweak replication test

* Pass down unit

* Wait for test app to idle

* Add comment

* Port config changes

* Copy policy test

* Fix import

* Missed param removal

* Unit test

* Missing attr

* Add logs

* Add timeout to connection

* Log conn str

* Fix num of standbys

* Charm fixture

* Remove stepdown hook

* Config description

* Revert conn str

* Add async scaling test

* Typo

* Don't remove standby and primary

* Update dependency psutil to v7 (#772)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency cryptography to v44.0.1 [SECURITY] (#764)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.0.2 (#765)

* Update canonical/data-platform-workflows action to v30.0.2

* Update promote.yaml

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

* [DPE-6323] Handle missing stanza output (#727)

* Handle missing stanza output

* Update libs

* Unit tests

* Update canonical/has-signed-canonical-cla action to v2 (#773)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [MISC] Define charm constants (#774)

* Lock file maintenance Python dependencies (#743)

* Lock file maintenance Python dependencies

* Backoff boto3 1.36

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Dragomir Penev <dragomir.penev@canonical.com>

* Update charmcraft.yaml build tools (#768)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update canonical/data-platform-workflows action to v30.1.3 (#776)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Update dependency uv to v0.6.3 (#780)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* [MISC] Sanitize PostgreSQL extra-user-roles arg (#782)

* [MISC] Fix PostgreSQL lib function signature (#786)

* [MISC] Skip backup and subordinate tests without creds (#789)

* Bump libs

* Skip backup tests without creds

* Skip subordinate tests

* Update tests/integration/test_subordinates.py

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

---------

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>

* Update dependency jinja2 to v3.1.6 [SECURITY] (#788)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Disable cache

* Reduce required approvals on Renovate pull requests by 1 (#787)

* Sync docs from Discourse (#748)

Co-authored-by: GitHub Actions <41898282+github-actions[bot]@users.noreply.github.com>

* Cleanup juju 2 tests

* Linting

* Integration test diffs

* Try with series for ubuntu pro subordinate

* Filter terminated units

* Bump PG version

* Disable pgaudit for timescale and postgis

* Linting

* Remove tests

* Remove param for secrets

* Linting

* Idle when disabling pgaudit

* Actually disable audit

* Disable timescale in object test

* Try to disable plugins between tests

* Update canonical/data-platform-workflows action to v30.2.0 (#792)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Try to disable pgaudit in general

---------

Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Sinclert Pérez <sinclert.perez@canonical.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Dual branch configs

* Revert lib changes

* Remove legacy rels

* Restore pydantic rule

* Remove legacy rels metadata

* Remove manual dispatch

* Create schema to test admin user privileges

* Cleanup markers

* Workaround for cluster restore test

* Promote permadiff

* Apply suggestions from code review

Co-authored-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>

* Bump snaps

---------

Co-authored-by: Shayan Patel <shayan.patel@canonical.com>
Co-authored-by: Lucas Gameiro Borges <lucas.borges@canonical.com>
Co-authored-by: Carl Csaposs <carl.csaposs@canonical.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Dragomir Penev <6687393+dragomirp@users.noreply.github.com>
Co-authored-by: Sinclert Pérez <sinclert.perez@canonical.com>
Co-authored-by: Dragomir Penev <dragomir.penev@canonical.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants