Update CI stability #6625
Does this test include #6591?
1. I see that you ran on fjetter/distributed.
2. I strongly suspect a lot of failures may be related to #6271.
2b. I think we could figure out a straightforward way to mass-skip all tests decorated with |
yes
the dask project is also on free tier, isn't it? |
Most of the Windows tests are failing because of a disk permission problem during cleanup. @graingert suggested that using the pytest fixtures instead of tempfile would help with this. |
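As a minimal sketch of what that suggestion could look like: pytest's built-in `tmp_path` fixture hands each test its own `pathlib.Path` directory, and pytest itself manages creating and later removing it, so the test carries no cleanup code of its own. The helper `write_spill_file` and the file name are hypothetical and only serve the illustration; whether this actually resolves the Windows permission errors is not established here.

```python
from pathlib import Path


def write_spill_file(directory: Path, payload: bytes) -> Path:
    """Hypothetical helper: write payload into directory and return the file path."""
    target = directory / "spill.bin"
    target.write_bytes(payload)
    return target


def test_write_spill_file(tmp_path):
    # tmp_path is a built-in pytest fixture: a fresh pathlib.Path per test.
    # The test never deletes the directory itself; pytest handles that.
    target = write_spill_file(tmp_path, b"abc")
    assert target.read_bytes() == b"abc"
    assert target.parent == tmp_path
```

Compared with calling `tempfile.TemporaryDirectory()` inside the test, the directory removal no longer happens while the test body (and any process still holding files open) is running.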
New update based on a8eb3b2 using branch https://github.com/fjetter/distributed/tree/stress_ci, again running 10 iterations; see fjetter@3fe45b4. The test reports and summary were generated with the code available at fjetter@6be7790. The test report is again available at https://gistpreview.github.io/?ecc2cdddf651df9ee0c7966e210c9093/a8eb3b23b8fe91f52758db155e7151e3d516cbdc.html
On average, every full test matrix included 2.7 failures. No full test matrix was successful. Again, we can observe a couple of systematic problems.
|
New update based on e1b9e20 using branch https://github.com/fjetter/distributed/tree/stress_ci
High level summary
We're already performing much better, with the exception of OSX py3.8, where not a single run was successful. Below are a few detailed reports about the kinds of errors encountered.
Detailed error report
This is a groupby on the truncated error message as a proxy for a fuzzy match.
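The grouping idea can be sketched roughly as follows. This is not the code used to produce the report; the function name `group_by_truncated_message` and the `width` parameter are assumptions made for illustration. Messages that differ only in their tail (ports, paths, timings) collapse into one bucket once they are cut to a common prefix length.

```python
from collections import Counter


def group_by_truncated_message(messages, width=40):
    """Count error messages by their first `width` characters.

    Whitespace is normalized first so that line breaks inside a message
    do not split otherwise identical prefixes into separate groups.
    """
    counts = Counter(" ".join(msg.split())[:width] for msg in messages)
    # Most frequent prefix first, as (prefix, count) pairs.
    return counts.most_common()
```

The choice of `width` is a trade-off: too short and unrelated errors merge, too long and variable details such as addresses split one failure mode into many rows.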
Full error messages
|
The timeout errors in row three appear to throw a couple of |
Following a couple of recent merges, I triggered another "CI stress test" yesterday that runs our suite a couple of times in a row (this time 10)
see https://github.com/fjetter/distributed/tree/stress_ci
which is based on dc019ed
with fjetter@68689f0 on top
The results of this test run can be seen at https://github.com/fjetter/distributed/runs/7029246894?check_suite_focus=true
Summary
Overall, we had 80 jobs spread across the different OSs and Python versions, of which 32 failed.
If we look at an entire test run, i.e. a full test matrix for a given run number, not a single one would've been successful.
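The difference between counting failed jobs and counting fully green matrices can be sketched as below. The function `summarize_runs` and the tuple layout are hypothetical, not taken from the report tooling; the point is only that a run counts as green when every job in its matrix passed, so a moderate per-job failure rate can still leave zero green runs.

```python
from collections import defaultdict


def summarize_runs(jobs):
    """Summarize CI results given (run_number, job_name, passed) tuples."""
    jobs = list(jobs)
    by_run = defaultdict(list)
    for run_number, _job_name, passed in jobs:
        by_run[run_number].append(passed)
    return {
        "total_jobs": len(jobs),
        "failed_jobs": sum(1 for _run, _name, passed in jobs if not passed),
        "total_runs": len(by_run),
        # A run is green only if all jobs of its matrix passed.
        "green_runs": sum(1 for results in by_run.values() if all(results)),
    }
```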
Looking at the kinds of test failures, we see that three jobs on Windows failed due to a GH actions test timeout of 120s, and two test runs were cancelled by GitHub without further information, also on Windows.
The timing out test runs do not have anything obvious in common. In fact, one of the three timed out tests appears to have finished running the pytest suite but still timed out.
A modified test report is available here: https://gistpreview.github.io/?ecc2cdddf651df9ee0c7966e210c9093