-
Notifications
You must be signed in to change notification settings - Fork 27
Add test_smoke
test
#420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add test_smoke
test
#420
Conversation
* Change charm database user * Fix unit tests * Fix integration test call * Fix user name in library * Fix user * Add default postgres user creation * Change action name
* Change charm database user * Fix unit tests * Fix integration test call * Fix user name in library * Fix user * Add default postgres user creation * Change action name * Rework secrets management
* Change charm database user * Fix unit tests * Fix integration test call * Fix user name in library * Fix user * Add default postgres user creation * Change action name * Rework secrets management * Add password rotation logic * Add user to the action parameters * Add pytest marks * Fix action description * Fix method docstring * Fix pytest mark * Update src/charm.py Co-authored-by: Raúl Zamora Martínez <76525382+zmraul@users.noreply.github.com> Co-authored-by: Raúl Zamora Martínez <76525382+zmraul@users.noreply.github.com>
ops 1.5.2 introduced a fix to the harness' handling for departing unit tracking for events. This uncovered a subtle error in the tests here. remove_relation was triggering the "departing" relation data key to be inserted into the local charm's data because the local charm had been added to the relation units list (which is not really how that is intended to be used). This was causing the delete_user method to be called earlier than intended and caused assertion failures in the on_relation_broken test. The fix is to not add the local unit to the relation units. But then the relation departed test that checks for the "departing" data key wasn't working. So for now, we can add the local unit to the relation in that test only.
* Change charm database user * Fix unit tests * Fix integration test call * Fix user name in library * Fix user * Add default postgres user creation * Change action name * Rework secrets management * Add password rotation logic * Add user to the action parameters * Add separate environments for different tests * Add all dependencies * Add pytest marks * Fix action description * Fix method docstring * Fix pytest mark * Fix pytest markers * Register pytest markers
* Change charm database user * Fix unit tests * Fix integration test call * Fix user name in library * Fix user * Add default postgres user creation * Change action name * Rework secrets management * Add password rotation logic * Add user to the action parameters * Add separate environments for different tests * Add all dependencies * Add pytest marks * Fix action description * Fix method docstring * Fix pytest mark * Fix pytest markers * Register pytest markers * Import files * Update libraries * Update library * Fix PostgreSQL library * Add jsonschema as a binary dependency * Reorganize dependencies and aff f-string * Reorganize user definition
* Add TLS implementation * Delete file * Update library * Fix PostgreSQL library * Add jsonschema as a binary dependency * Change hostname to unit ip * Add unit test dependency * Call certificate update on config change * Fix docstring * Change log call
* Add integration test * Update library * Fix PostgreSQL library * Add relation broken test * Add jsonschema as a binary dependency * Change hostname to unit ip * Add unit test dependency * Add check for encrypted connections * Fix space and None check
* Add workaround for Patroni REST API TLS * Improve TLS status retrieval * Readd checks * Add additional check
* Add TLS to README * Add password rotation to README
* Add continuous writes test charm * Add comment
* Add kill db process test * Change timeout * Ignore PyCharm folder and LXD profile file * Add continuous writes test charm * Improve workflow files
Avoid a long-running integration test in case of failing gatekeeping tests. It will slightly increase the complete tests scope runtime but will save (a lot?) of electricity/money for Canonical as often new pull requests have some initial typos/issues to be polished.
DPE-781 Run integration tests for passed lint/unit tests only
* Add alternative servers for primary and members retrieval * Test working * Test working * Cleanup the code * More cleanup * Small adjustments * Add unit tests * Improve comments * Use down unit * Improve alternative URL description * Add additional checks * Improve returns
* fix early tls deployment by only reloading patroni config if it's already running * added unit test for reloading patroni * lint * removing postgres restart check * adding series flags to test apps * adding series flags to test apps * made series into a list * Update test_new_relations.py * updating test to better emulate bundle deploymen * updated tls lib * Fix TLS IP address on CSR * removed failing test for now. This will be fixed before merge * Fix TLS * Update TLS lib * Remove unneeded series * Add comment * Add Juju agent version bootstrap constraint * Add test for restart method * Update TLS lib * Add test for update config method * Update TLS lib * Improve comment Co-authored-by: WRFitch <will.fitch@canonical.com> Co-authored-by: Will Fitch <WRFitch@outlook.com>
* Add restart db process test * Change restart delay * Update test code * Change processes that should be tested * Re enable tests * Remove unused code * Remove unused import * Remove unused code * Add docstring * Remove additional tests * Rollback patroni version
* Add SST test * Enable previous tests * fix early tls deployment by only reloading patroni config if it's already running * Improve code * Remove duplicate check * Remove unused import * added unit test for reloading patroni * lint * removing postgres restart check * Pin Juju agent version on CI * adding series flags to test apps * adding series flags to test apps * made series into a list * Update test_new_relations.py * Add retrying * updating test to better emulate bundle deploymen * Remove unused code * Change processes list * Add logic for ensuring all units down * Change delay to only one unit * Add WAL switch * Updates related to WAL removal * Small improvements * Add comments * Change the way service is stopped * Remove slot removal * Small fixes * Remove unussed parameter Co-authored-by: WRFitch <will.fitch@canonical.com> Co-authored-by: Will Fitch <WRFitch@outlook.com>
* Add restart db process test * Change restart delay * Pin Juju agent version on CI * Update test code * Change processes that should be tested * Re enable tests * Remove unused code * Remove unused import * Remove unused code * Add docstring * Fix Patroni API calls * Fix kill db process test error * Improve comments * Remove unused code * Revert change * Unify tests * Replace ps aux | grep with pgrep * Readd changes removed on merge
* Prevent relation extensions * Integration test
* Pin OS on release workflow * Change whitelist_externals to allowlist_externals
* Pin OS on release workflow * Change whitelist_externals to allowlist_externals * Update PostgreSQL library * Add integration test for the fix from the library
…cal#32) * Add pgbackrest * Add initial backup settings * Add additional backup settings * Remove settings * Rename user * Add test for TLS being used on pg_rewind connections * Remove table creation * Readd write to the database * Change bootstrap contraints * Change the way the service is stopped * Add one more call to service stop * Change the way the service is stopped * Change systemd unit * Increase test timeout * Remove test code * Read code * Readd code * Change WAL trigger mechanism * Fix check * Improve code * Remove instance promotion * Readd instance promotion * Change test retry logic * Remove debug calls * Add replica reinitialization * Change checks order * Add reinitialize call * Improve reinitialize call * Remove unused code * Pin OS on release workflow * Change whitelist_externals to allowlist_externals * Change whitelist_externals to allowlist_externals * Add API request timeout * Add unit tests * Remove log * Revert timeout * Extract endpoint from URL to constant
* Unpinning libjuju * Update pinned flake8 version * Test tweak * Pinning juju 2.9.38.1 * Tweaking CSR check
* Fix service restart at boot * Add fix for TLS files and postgres user * Add integration test * Fix unit test * Fix ubuntu version and update charming actions * Move test * Remove previous test * Revert change on unit test * Minor fixes * Fix unit tests * Modify condition * Update Juju version * Setup tmate session * Add Waiting Status * Add defer call * Change reboot command * Add waiting status * Remove incorrect defer call * Add replica related changes * Simplify test * Simplify code and test * Add comment about latest juju 2 * Update charming actions * Make test optional * Add unit tests * Add docstring and type hints * Some fixes * Minor fixes * Remove prints * Fix bundle deployment * Add additional unit tests
* Fix unit tests * Update postgresql_tls lib * Readding test_on_relation_broken_extensions_keep_block * Code review improvements
* Fix error that is raised when calling change WAL settings * Remove unused import * Add reformatted file
* ops and charm libs version bump * Update flake8
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Hi @BON4! Thanks for the contribution. Please add a PR description so we can understand what you're proposing. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed today, please add link to the GH/Jira issue. We always must link commits to bugtracker.
Please add PR description: What? Why? How?
Please improve the test to check charm and not juju :-)
Tnx!
# Check if storage exists after application deployed | ||
assert await is_storage_exists(ops_test, storage_id_str) | ||
|
||
await ops_test.model.remove_application(APPLICATION_NAME, block_until_done=True, force=True, no_wait=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, this will remove storage (as it is removed by default on LXD with hostpath storage type (default)), so the next text will be deployed again without any storage. Consider to add destroy_storage=False (but I am not sure if it help).
# Create test database to check there is no resouces conflicts | ||
await create_db(ops_test, APPLICATION_NAME, TEST_DATABASE_RELATION_NAME) | ||
|
||
# Deploy duplicaate charm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed on the sync today: we are testing Juju here, but we need to test charm.
Juju guaranties that independent applications will have interdependent storages, no need to test it here, IMHO. But you can keep it as already created as it should be fast.
Our goal is to test charm:
- charm A can start with storage from A (cleanly). Even if there are some garbage lock files on storage.
- charm A can start with storage from B (with warning in log, about foreign storage, but deploying anyway as it can be an app recovery process).
P.S. the goal of smoke test is a fast result, so try to keep the test fast (in time).
test_config_parameters
test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one comment to polish comments... triggering tests to see the test results.
# Check that test database is not exists for duplicate application | ||
assert not await check_db(ops_test, APPLICATION_NAME, TEST_DATABASE_RELATION_NAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says duplicate application
but used APPLICATION_NAME and not DUP_APPLICATION_NAME. Is it OK? IMHO, comment is outdated and confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
logger.info(f"Application is not in blocked state. Checking logs...") | ||
|
||
# Since application have postgresql db in storage from external application it should not be able to connect due to new password | ||
assert not await check_password_auth(ops_test, ops_test.model.applications[DUP_APPLICATION_NAME].units[0].name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JFMI: does charm start with foreign storage/data or just stuck?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This charm stalls forever in Init:CrashLoopBackOff loop with "waiting, installing agent". As I know, this commit will fix this issue, and deployment will be in blocked
state instead of falling in to forever CrashLoopBackOff due to database from another application.
@dragomirp @marceloneppel bumping |
Hi, @BON4, please sign the CLA and sync up with the main branch. |
|
||
@pytest.mark.group(1) | ||
@pytest.mark.abort_on_fail | ||
async def test_app_force_removal(ops_test: OpsTest, charm: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO there is no need to test this either. With force removal the charm code shouldn't even fire, so this would be handled by Juju entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. It makes more sense only to test whether the detached storage left from this test can be reused in a new deployment, as you did in the next tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@taurus-forever suggested to leave test_app_force_removal
just for the sake of testing the correctness of storage detaching (charm will be removed with force correctly). Since this is smoke test, it should be fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution, @BON4! I requested some changes. I also agree with Dragomir's comments.
storage_id, | ||
"--format=json", | ||
] | ||
print(f"command: {complete_command}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to logger.info
or logger.debug
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is removed.
@pytest.mark.group(1) | ||
@pytest.mark.abort_on_fail | ||
async def test_app_removal(ops_test: OpsTest, charm: str): | ||
"""Test all recoureces is removed after application removal.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Test all recoureces is removed after application removal.""" | |
"""Test all resources are removed after application removal.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
garbadge_storage = await get_any_deatached_storage(ops_test) | ||
assert garbadge_storage is not None | ||
|
||
assert garbadge_storage is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't it fail in L164 and stop the test? I think you can use reraise=True
in the Retrying
object to show the assertion error in the logs. I think the same applies to L127.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Used reraise=True
in all Retrying loops.
|
||
@pytest.mark.group(1) | ||
@pytest.mark.abort_on_fail | ||
async def test_app_recoures_conflicts(ops_test: OpsTest, charm: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async def test_app_recoures_conflicts(ops_test: OpsTest, charm: str): | |
async def test_app_resources_conflicts(ops_test: OpsTest, charm: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIxed
|
||
try: | ||
await ops_test.model.wait_for_idle( | ||
apps=[DUP_APPLICATION_NAME], timeout=500, status="blocked" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what's the message of this blocked status?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not yet implemented in operator but this commit will bring changes, so that application will fall in blocked state with "Failed to start postgresql. The storage belongs to a third-party cluster"
message. I added check for operator auth down below just to double check and in case this test will be merged before the commit for blocked status.
# Since application have postgresql db in storage from external application it should not be able to connect due to new password
assert not await check_password_auth(
ops_test, ops_test.model.applications[DUP_APPLICATION_NAME].units[0].name
)
|
||
@pytest.mark.group(1) | ||
@pytest.mark.abort_on_fail | ||
async def test_app_force_removal(ops_test: OpsTest, charm: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. It makes more sense only to test whether the detached storage left from this test can be reused in a new deployment, as you did in the next tests.
@dragomirp @marceloneppel @taurus-forever Hello team, bumping for review and re-run checks, after changes applied. |
383970a
to
db46208
Compare
Issue #441
All this cases should be covered:
--force --no-wait
, storage should be detached.