@dragomirp dragomirp commented Jul 18, 2024

Try to reduce the TLS test flakiness.

codecov bot commented Jul 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.87%. Comparing base (2a944d0) to head (34a9606).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #534   +/-   ##
=======================================
  Coverage   70.87%   70.87%           
=======================================
  Files          11       11           
  Lines        3021     3021           
  Branches      535      535           
=======================================
  Hits         2141     2141           
  Misses        764      764           
  Partials      116      116           


@dragomirp dragomirp changed the title [DPE-4533] Increase Patroni's loop_wait [DPE-4533] Increase Patroni's loop_wait in TLS test Jul 18, 2024
@dragomirp dragomirp force-pushed the dpe-4533-flaky-test branch from 255defd to e2b9025 Compare July 18, 2024 14:44
@dragomirp dragomirp changed the title [DPE-4533] Increase Patroni's loop_wait in TLS test [DPE-4533] Pause Patroni in the TLS test Jul 24, 2024
@dragomirp dragomirp force-pushed the dpe-4533-flaky-test branch from 2899902 to a75bfa2 Compare July 24, 2024 12:34
@dragomirp dragomirp force-pushed the dpe-4533-flaky-test branch from 3a8f445 to a576119 Compare July 24, 2024 14:10
@dragomirp dragomirp force-pushed the dpe-4533-flaky-test branch from edccdc1 to 03863f4 Compare July 24, 2024 22:03
@dragomirp dragomirp force-pushed the dpe-4533-flaky-test branch from 03863f4 to d5bfb7d Compare July 25, 2024 08:33
@dragomirp dragomirp force-pushed the dpe-4533-flaky-test branch from f903463 to 476c8a9 Compare July 25, 2024 09:27


- async def get_patroni_setting(ops_test: OpsTest, setting: str) -> Optional[int]:
+ async def get_patroni_setting(ops_test: OpsTest, setting: str, tls: bool = False) -> Optional[int]:
Contributor Author

We don't really need this, but kept for compatibility.

Comment on lines +113 to +114
# Pause Patroni so it doesn't wipe the custom changes
await change_patroni_setting(ops_test, "pause", True, use_random_unit=True, tls=True)
Contributor Author

Pause and unpause Patroni instead of trying to tweak loop_wait.

Member

Great solution!
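
For readers unfamiliar with the mechanism: Patroni exposes its dynamic configuration through the REST API, and a PATCH to `/config` with `{"pause": true}` puts the cluster into maintenance mode. The sketch below only builds such a request with the standard library and does not send it. The function name `build_pause_request` and the example IP are hypothetical, port 8008 is Patroni's default REST port, and the test suite's real `change_patroni_setting` helper may work differently (for example, in how it handles certificates).

```python
import json
import urllib.request


def build_pause_request(unit_ip: str, paused: bool, tls: bool = False) -> urllib.request.Request:
    """Build (but do not send) a PATCH /config request toggling Patroni's pause flag.

    Hypothetical helper for illustration; not the charm's actual implementation.
    """
    # Use HTTPS when the cluster has TLS enabled, plain HTTP otherwise.
    scheme = "https" if tls else "http"
    body = json.dumps({"pause": paused}).encode("utf-8")
    return urllib.request.Request(
        url=f"{scheme}://{unit_ip}:8008/config",  # 8008 is Patroni's default REST port
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    request = build_pause_request("10.0.0.1", paused=True, tls=True)
    print(request.get_method(), request.full_url, request.data.decode())
```

Calling the same helper with `paused=False` builds the request that resumes normal Patroni operation.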

# Pause Patroni so it doesn't wipe the custom changes
await change_patroni_setting(ops_test, "pause", True, use_random_unit=True, tls=True)

async with ops_test.fast_forward("24h"):
Contributor Author

Disable update_status so it doesn't run while we're tweaking the cluster.
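
As a rough illustration of the idea behind `fast_forward("24h")`: the Juju model setting `update-status-hook-interval` is raised for the duration of the block and restored afterwards, so the hook effectively never fires while the test tweaks the cluster. The sketch below is not pytest-operator's implementation; `FakeModel` and `hold_update_status` are hypothetical names used so the example runs without a Juju controller.

```python
import asyncio
import contextlib


class FakeModel:
    """Hypothetical stand-in for a Juju model; keeps config in a dict."""

    def __init__(self):
        self.config = {"update-status-hook-interval": "5m"}

    async def get_config(self):
        return dict(self.config)

    async def set_config(self, changes):
        self.config.update(changes)


@contextlib.asynccontextmanager
async def hold_update_status(model, interval="24h"):
    """Temporarily set the update-status interval, restoring it on exit."""
    key = "update-status-hook-interval"
    previous = (await model.get_config())[key]
    await model.set_config({key: interval})
    try:
        yield
    finally:
        # Restore the original interval even if the block raised.
        await model.set_config({key: previous})


async def demo():
    model = FakeModel()
    async with hold_update_status(model):
        during = model.config["update-status-hook-interval"]
    after = model.config["update-status-hook-interval"]
    return during, after


if __name__ == "__main__":
    print(asyncio.run(demo()))
```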

Comment on lines +169 to +176
for attempt in Retrying(stop=stop_after_attempt(10), wait=wait_fixed(5), reraise=True):
    with attempt:
        logger.info("Trying to grep for rewind logs.")
        await run_command_on_unit(
            ops_test,
            primary,
            "grep 'connection authorized: user=rewind database=postgres SSL enabled' /var/snap/charmed-postgresql/common/var/log/postgresql/postgresql-*.log",
        )
Contributor Author

Retry for about a minute (10 attempts, 5 seconds apart) so the logs have time to flush.
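
To make the timing concrete: with 10 attempts and a fixed 5-second wait there are at most 9 waits, so roughly 45 seconds of sleeping plus the time each command takes. The sketch below reproduces that fixed-wait retry pattern with the standard library only. It is a stand-in for illustration, not tenacity itself; `retry_fixed` and `flaky_grep` are hypothetical names, and the `sleep` parameter is injectable so the example runs instantly.

```python
import time


def retry_fixed(func, attempts=10, wait_seconds=5.0, sleep=time.sleep):
    """Call func until it succeeds, sleeping a fixed time between failed attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # comparable to reraise=True: surface the last error
            sleep(wait_seconds)


if __name__ == "__main__":
    waits = []
    calls = {"count": 0}

    def flaky_grep():
        # Simulates grep failing until the log line has been flushed.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("log line not flushed yet")
        return "found"

    # Record the waits instead of actually sleeping.
    print(retry_fixed(flaky_grep, sleep=waits.append), waits)
```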

Comment on lines +178 to +180
await change_patroni_setting(ops_test, "pause", False, use_random_unit=True, tls=True)

async with ops_test.fast_forward():
Contributor Author

Re-enable Patroni and fast-forward for the removal test.

@dragomirp
Contributor Author

TLS test for amd64 on Juju 3.4.4 succeeded 5 times in a row here

@dragomirp dragomirp marked this pull request as ready for review July 25, 2024 13:23
@dragomirp
Contributor Author

Unrelated to the PR, but do we need the test_restart_machine test? It seems to be constantly skipped.

Member

@marceloneppel marceloneppel left a comment

LGTM!

Comment on lines +113 to +114
# Pause Patroni so it doesn't wipe the custom changes
await change_patroni_setting(ops_test, "pause", True, use_random_unit=True, tls=True)
Member

Great solution!

@marceloneppel
Member

> Unrelated to the PR, but do we need the test_restart_machine test? It seems to be constantly skipped.

I think it can be removed nowadays because the issue related to it was solved a long time ago.

Contributor

@taurus-forever taurus-forever left a comment

Interesting approach. 🤞

3 participants