ci: run integration tests serially #1143
Conversation
Integration tests are incredibly flaky at the moment. Perhaps this is related to running the tests in parallel. Certainly the fact that the tests are randomly distributed across workers doesn't help with figuring out what works and what doesn't.
Thinking about this, I suspect the integration tests will hit the 150-minute limit and time out, since the …
Woah, it actually completed -- in 73 minutes! There sure is a lot of parallelism overhead if 4 workers only get us down to around 75% of that time (55 minutes). Three failed tests:
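As a back-of-the-envelope check of that overhead claim (the 73- and 55-minute figures are from the CI runs above; the speedup/efficiency formulas are standard, not from this thread):

```python
# Rough parallel-efficiency estimate for the integration suite.
serial_minutes = 73    # this PR's serial run
parallel_minutes = 55  # earlier run with -n auto (4 workers on GitHub)
workers = 4

speedup = serial_minutes / parallel_minutes  # ~1.33x
efficiency = speedup / workers               # fraction of each worker doing useful work

print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
# -> speedup: 1.33x, efficiency: 33%
```

So with 4 workers the suite only runs about 1.33x faster, i.e. roughly 33% parallel efficiency -- consistent with the "a lot of overhead" reading above.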
Also of note is that there's a third test that would fail 100% of the time, but didn't fail here --
This is clearly an improvement!
Having run the integration tests twice on this PR shows that while this greatly reduces flakiness, we aren't quite at deterministic integration testing yet -- the two runs have two failed tests in common, but each has one additional failing test that passes in the other. I'm going to run the tests again to collect additional data, and maybe create a second draft PR to parallelise exploring this.
The third run on this PR got stuck and failed due to timeout, so we've still got that to contend with.
We have two very consistent failing tests, but weirdly each of the 3 completed runs also has an extra test failure unique to it.

Common failures:

Unique failures:

The second run on #1144 runs long and has multiple failures.
From running the tests in my fork too, we have a first run
Featuring the two consistent failures and two of the unique ones. And a second run
Which looks significantly worse: the two consistent failures, two of the unique ones, and five more. Note also the significantly longer running time. https://github.com/juju/python-libjuju/actions/runs/11226829336/job/31213287003
https://github.com/juju/python-libjuju/actions/runs/11224778599/job/31212363093
https://github.com/juju/python-libjuju/actions/runs/11226829336/job/31219977035
https://github.com/juju/python-libjuju/actions/runs/11224778599/job/31219987411
Here are tables summarising the (so far) 15 runs of the serialised tests. I'll probably continue to edit the previous comment with output, and this comment to update the tables. commit=501cc36b7a1da0bfc329894e71e478dba900dc28
How many failing tests does each job have?
How many tests fail once, twice, etc?
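The two tallies above (failures per job, and how many tests fail once, twice, etc.) can be produced with a small sketch like the following; the per-run failure lists here are placeholders, not the actual CI data:

```python
from collections import Counter

# Hypothetical failed-test lists, one per completed run.
# Substitute the real results from the job logs.
runs = [
    ["test_a", "test_b", "test_c"],
    ["test_a", "test_b", "test_d"],
    ["test_a", "test_b", "test_e"],
]

# How many failing tests does each job have?
per_job = [len(failed) for failed in runs]

# How many tests fail once, twice, etc.?
failure_counts = Counter(test for failed in runs for test in failed)
by_frequency = Counter(failure_counts.values())

print(per_job)             # -> [3, 3, 3]
print(dict(by_frequency))  # -> {3: 2, 1: 3}: 2 tests fail in all 3 runs, 3 fail once
```

This matches the pattern described above: two consistent failures plus one failure unique to each run.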
In addition to tests still failing apparently at random, and tests sometimes failing to terminate, the integration test suite can also sometimes fail due to external causes: https://github.com/james-garner-canonical/python-libjuju/actions/runs/11226791048/job/31260866663
https://github.com/juju/python-libjuju/actions/runs/11226829336/job/31260861822
https://github.com/juju/python-libjuju/actions/runs/11224778599/job/31260863660
Closing in favour of #1149
…ine-and-serialise #1149

Tests in `integration/test_model.py` seem to be flaky even when run serially. All tests in `integration/test_crossmodel.py` are currently skipped, except one which used to be skipped, and is currently flaky even when run serially. This PR:

* Serialises all integration tests, following #1143
* Skips two tests from `test_model.py` that seem to always fail currently, whether run serially or in parallel, following #1145
* Moves the flaky tests noted above into a separate job, so that the job running the remaining integration tests will hopefully have a shot at succeeding

As a bonus feature, this split of the tests into two runners with `-n 1` seems to be faster than the original method of running all the integration tests in a single runner with `-n auto` (which worked out to be 4 processes on GitHub).
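The two-runner split described above could look something like the following GitHub Actions fragment. This is an illustrative sketch only: the job names, paths, and the `test_flaky_example` identifier are made up, and the real workflow may invoke the suite differently (e.g. via tox). `-n 1`/`-n auto` are pytest-xdist worker counts, and `--deselect` is the standard pytest flag for excluding a specific test.

```yaml
# Illustrative sketch, not the actual workflow file.
jobs:
  integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Run everything serially, excluding the known-flaky test
      # (test name is hypothetical).
      - run: >
          python -m pytest tests/integration -n 1
          --deselect tests/integration/test_model.py::test_flaky_example
  integration-flaky:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Quarantine the flaky test in its own job so its failures
      # don't mask the rest of the suite.
      - run: >
          python -m pytest
          tests/integration/test_model.py::test_flaky_example -n 1
```

Because the two jobs run on separate runners concurrently, the wall-clock time is bounded by the slower job, which is how the split can beat a single `-n auto` runner despite each job using `-n 1`.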
Description
Integration tests are incredibly flaky at the moment. Perhaps this is related to running the tests in parallel. Certainly the fact that the tests are randomly distributed across workers doesn't help with figuring out what works and what doesn't.
QA Steps
Run integration tests and see.