[DPE-8426] Port Jubilant upgrade and async replication tests #1206
base: 16/edge
Conversation
Codecov Report
❌ Your project check has failed because the head coverage (66.83%) is below the target coverage (70.00%). You can increase the head coverage or adjust the target coverage.

@@           Coverage Diff            @@
##           16/edge    #1206    +/-  ##
===========================================
- Coverage    66.94%   66.83%   -0.11%
===========================================
  Files           17       17
  Lines         4347     4360      +13
  Branches       671      675       +4
===========================================
+ Hits          2910     2914       +4
- Misses        1242     1249       +7
- Partials       195      197       +2

☔ View full report in Codecov by Sentry.
Force-pushed from 98cb060 to 85ae99f
Force-pushed from 85ae99f to 76fe647
Force-pushed from eff6861 to 744dd63
Force-pushed from 744dd63 to 1eb046e
Force-pushed from 9fa9066 to c35a605
Force-pushed from 8eaec8a to 731f8b9
import subprocess
from collections.abc import Callable

import jubilant
Using regular jubilant, since PG 16 doesn't need juju 2 support.
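Since these tests only target PostgreSQL 16 (and therefore juju 3), the module can import the stock library directly. A minimal sketch of what that looks like, assuming a module-scoped `juju` fixture (the fixture name and scope are illustrative, not taken from the PR):

```python
import jubilant
import pytest


@pytest.fixture(scope="module")
def juju() -> jubilant.Juju:
    # jubilant shells out to the juju 3 CLI and targets the current model,
    # so no juju 2 compatibility layer is needed here.
    return jubilant.Juju()
```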
logging.info("checking the number of switchovers")
final_number_of_switchovers = count_switchovers(juju, DB_APP_NAME)
assert (final_number_of_switchovers - initial_number_of_switchovers) <= 2, (
    "Number of switchovers is greater than 2"
)
MySQL checks that the DB primary moves to the Juju leader; here we check the number of switchovers instead.
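For context, a hedged sketch of what a `count_switchovers` helper could look like; the Patroni `/history` endpoint, default port, and the unit-address lookup through jubilant's status are assumptions, not necessarily how the test helper is actually implemented:

```python
import jubilant
import requests


def count_switchovers(juju: jubilant.Juju, app_name: str) -> int:
    """Count timeline switches recorded by Patroni for the given application."""
    status = juju.status()
    unit = next(iter(status.apps[app_name].units.values()))
    # Patroni's /history endpoint returns one entry per timeline switch
    # (assumption: the REST API listens on port 8008 without TLS or auth).
    history = requests.get(f"http://{unit.public_address}:8008/history", timeout=10).json()
    return len(history)
```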
Force-pushed from 908f94e to 0d7bc5f
Force-pushed from 93ad21d to be9df49
Force-pushed from 105a1f5 to c9f5501
Force-pushed from fda46d2 to 239f7f3
base_patch = {}
if primary_endpoint := self.async_replication.get_primary_cluster_endpoint():
    base_patch["standby_cluster"] = {"host": primary_endpoint}
standby_cluster is a DCS value, so it is not updated by just changing the config. Should I also add this to the async relation changed hook?
I do not have a strict opinion here. Up to Marcelo.
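For illustration, a minimal sketch of updating the DCS value directly through Patroni's `/config` REST endpoint rather than only rewriting the local configuration file; the URL handling, lack of auth/TLS, and payload shape are assumptions:

```python
import requests


def patch_standby_cluster(patroni_url: str, primary_endpoint: str | None) -> None:
    """Set or clear the standby_cluster entry in the DCS via Patroni's config API."""
    if primary_endpoint:
        payload = {"standby_cluster": {"host": primary_endpoint}}
    else:
        # Patroni removes a DCS key when it is patched to null.
        payload = {"standby_cluster": None}
    requests.patch(f"{patroni_url}/config", json=payload, timeout=10).raise_for_status()
```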
with attempt:
    current_primary = (
        self.get_primary() if not async_cluster else self.get_standby_leader()
    )
In the async cluster we have a standby leader.
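As a hedged sketch, the standby leader can be identified from Patroni's `/cluster` endpoint, which reports the `standby_leader` role in a standby cluster; the URL handling and error handling here are assumptions:

```python
import requests


def get_standby_leader(patroni_url: str) -> str | None:
    """Return the member holding the standby_leader role, if any."""
    cluster = requests.get(f"{patroni_url}/cluster", timeout=10).json()
    for member in cluster.get("members", []):
        if member.get("role") == "standby_leader":
            return member["name"]
    return None
```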
for attempt in Retrying(stop=stop_after_attempt(2), wait=wait_fixed(1), reraise=True):
    with attempt:
        if not self._charm._patroni.are_all_members_ready():
            raise charm_refresh.PrecheckFailed("PostgreSQL is not running on 1+ units")
When moving the standby leader, the member status briefly becomes stopped, so the action would fail if the standby leader is not already the lowest unit.
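For reference, a hedged sketch of the kind of readiness check the precheck retries around; the accepted member states and the endpoint are assumptions:

```python
import requests

# States in which a member counts as ready (assumption).
READY_STATES = {"running", "streaming"}


def are_all_members_ready(patroni_url: str) -> bool:
    """Return True only if every Patroni member reports a ready state."""
    cluster = requests.get(f"{patroni_url}/cluster", timeout=10).json()
    members = cluster.get("members", [])
    return bool(members) and all(m.get("state") in READY_STATES for m in members)
```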
if current_primary == candidate:
    logger.info("Candidate and leader are the same")
    return
Got conflicts in the standby cluster.
A Frankenstein of the existing PostgreSQL test and the MySQL one.
@pytest.fixture(scope="module")
def first_model(juju: Juju, request: pytest.FixtureRequest) -> Generator:
    """Create and return the first model."""
    yield juju.model
This fails to clean up when running locally on a Jubilant temp model.
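A minimal sketch of the companion fixture that owns its own model and can clean up explicitly, assuming jubilant's `add_model`/`destroy_model` calls and a `--keep-models` option; the fixture and model names are illustrative:

```python
from collections.abc import Generator

import jubilant
import pytest


@pytest.fixture(scope="module")
def second_model(juju: jubilant.Juju, request: pytest.FixtureRequest) -> Generator[str, None, None]:
    """Create a second model for the standby cluster and destroy it on teardown."""
    model_name = f"{juju.model}-other"  # assumes juju.model is set for the first model
    juju.add_model(model_name)
    yield model_name
    if not request.config.getoption("--keep-models", default=False):
        juju.destroy_model(model_name, destroy_storage=True, force=True)
```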
@pytest.fixture()
def first_model_continuous_writes(first_model: str) -> Generator:
Not sure how pytest resolves fixtures, so I used a different name than the one in conftest.
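A hedged sketch of what the renamed fixture does: start continuous writes before the test and clear them afterwards. The `start-continuous-writes`/`clear-continuous-writes` action names, the target unit, and the app-name constant are assumptions following the postgresql-test-app convention:

```python
from collections.abc import Generator

import jubilant
import pytest

DB_TEST_APP_NAME = "postgresql-test-app"  # assumption; the test module defines its own constant


@pytest.fixture()
def first_model_continuous_writes(first_model: str) -> Generator[None, None, None]:
    """Start continuous writes in the first model and clear them after the test."""
    model = jubilant.Juju(model=first_model)
    model.run(f"{DB_TEST_APP_NAME}/0", "start-continuous-writes")
    yield
    model.run(f"{DB_TEST_APP_NAME}/0", "clear-continuous-writes")
```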
model_2.deploy(
    charm=DB_TEST_APP_NAME,
    app=DB_TEST_APP_2,
    base="ubuntu@22.04",
    channel="latest/edge",
    num_units=1,
    constraints=constraints,
)
PGB can't switch over between clusters, so we use two test apps, as in the previous test.
for attempt in Retrying(stop=stop_after_attempt(10), wait=wait_fixed(3), reraise=True):
    with attempt:
We are waiting for the config to be patched and for the events to fire so that replication resumes.
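A minimal sketch of the wait pattern being described, retrying until the patched `standby_cluster` host is visible in the DCS config; the endpoint and helper shape are assumptions:

```python
import requests
from tenacity import Retrying, stop_after_attempt, wait_fixed


def wait_for_standby_config(patroni_url: str, primary_endpoint: str) -> None:
    """Retry until Patroni's DCS config shows the expected standby_cluster host."""
    for attempt in Retrying(stop=stop_after_attempt(10), wait=wait_fixed(3), reraise=True):
        with attempt:
            config = requests.get(f"{patroni_url}/config", timeout=10).json()
            assert (config.get("standby_cluster") or {}).get("host") == primary_endpoint
```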
def get_db_max_written_values(
    first_model: str, second_model: str, test_model: str, test_app: str
We need to know which of the models the test app is running in so we can stop it.
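A hedged sketch of the helper's shape: stop the writer in whichever model hosts the test app, then read the highest written value from each cluster. The `stop-continuous-writes` action and the `get_max_written_value` placeholder are hypothetical stand-ins for the real helpers:

```python
import jubilant


def get_max_written_value(model_name: str) -> int:
    """Hypothetical helper: query the cluster in `model_name` for its max written value."""
    raise NotImplementedError


def get_db_max_written_values(
    first_model: str, second_model: str, test_model: str, test_app: str
) -> list[int]:
    """Stop the writer in test_model, then collect the max written value per cluster."""
    # The test app only runs in one of the models, so the stop action must target it.
    jubilant.Juju(model=test_model).run(f"{test_app}/0", "stop-continuous-writes")
    return [get_max_written_value(model) for model in (first_model, second_model)]
```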
)
model_2 = Juju(model=second_model)
model_2.deploy(
    charm=charm,
The pre-refresh check will be run on the old code, so it will fail on edge (leader vs. standby leader).
Nice. This is massive!
Checklist