Volatility introduced in tests since approximately September 18th - potentially package sync #446
Comparing conda environments between pre- and post-merge runs, the most interesting diff is coiled itself, which jumped from 0.2.27 to 0.2.30, but I only compared two runs. I'd be interested to see a comparison across more commits, particularly the spikes.
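If it helps anyone reproduce that comparison, here's a minimal sketch of diffing two `conda list --export` dumps; the file names are made up for illustration:

```python
# Hypothetical sketch: diff two `conda list --export` dumps taken pre and
# post merge to spot version jumps (e.g. coiled 0.2.27 -> 0.2.30).
def parse_conda_export(path):
    """Parse `name=version=build` lines into a {name: version} dict."""
    versions = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            name, version, *_ = line.split("=")
            versions[name] = version
    return versions

before = parse_conda_export("env_pre_merge.txt")   # made-up file names
after = parse_conda_export("env_post_merge.txt")

for pkg in sorted(set(before) | set(after)):
    old, new = before.get(pkg, "absent"), after.get(pkg, "absent")
    if old != new:
        print(f"{pkg}: {old} -> {new}")
```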
Two thoughts.
It's very hard to find data for old tests, so here's what I can get without a ridiculous amount of work. Here's a cluster with py3.9 and dask/distributed 2022.6.0 (you can follow the link to see the other software versions): https://cloud.coiled.io/dask-engineering/clusters/83375/details Here's a cluster from before, also with py3.9 and dask/distributed 2022.6.0: https://cloud.coiled.io/dask-engineering/clusters/75547/details I see changes in pandas and pyarrow. Would that be relevant? Anyone else is welcome to dig in and compare those clusters further; I don't have much more insight.
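For clusters that are still reachable, `Client.get_versions()` from distributed reports the package versions the scheduler and workers see, which makes this kind of comparison easier than clicking through the UI. A minimal sketch (the scheduler address is a placeholder, and this obviously doesn't work for clusters that have already been torn down, which is part of why old data is hard to get):

```python
# Sketch: pull the package versions a still-running cluster reports.
from distributed import Client

client = Client("tls://<scheduler-address>")  # placeholder address

versions = client.get_versions(check=False)
scheduler_pkgs = versions["scheduler"]["packages"]
for pkg in ("pandas", "pyarrow", "dask", "distributed"):
    print(pkg, scheduler_pkgs.get(pkg))
```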
I've identified a couple of issues with package sync, in detecting packages on both the cluster and the client side. Staging has a ton of bugfixes I'm eager to get out, but it's currently held up by ensuring we've tested some of the infra work.
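(For readers unfamiliar with what package sync has to detect: it needs to enumerate what's installed on both the client and the cluster. The snippet below is purely an illustration of the client-side half, not Coiled's actual implementation.)

```python
# Illustration only: listing what is installed on the client side.
from importlib.metadata import distributions

installed = {dist.metadata["Name"]: dist.version for dist in distributions()}
for pkg in ("coiled", "dask", "distributed"):
    print(pkg, installed.get(pkg))
```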
Any reason to think package sync bugs are causing this?
I suspect it might be causing release versions to be installed instead of the main branch. If this is happening even on legacy tests, then it's unlikely.
Yeah, you can see the change on clusters running the legacy runtime. The examples I listed in #446 (comment) are runtime 0.1.0 w/ py39.
@shughes-uk Should we have an "environment" in the runtime where we test against staging?
If we can identify a subset of the tests for this it might be worthwhile; otherwise the cost might be prohibitive.
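For what it's worth, a small staging-only subset could be gated with a marker or an environment variable; a sketch, assuming a hypothetical `COILED_STAGING` variable set in CI (the variable and test name are illustrative, not part of the existing runtime):

```python
# Sketch: gate a small staging-only subset behind an env var.
# COILED_STAGING and the test name are hypothetical.
import os

import pytest

STAGING = os.environ.get("COILED_STAGING") == "1"


@pytest.mark.skipif(not STAGING, reason="only runs against the staging backend")
def test_staging_smoke():
    # Keep this subset tiny so the extra CI cost stays bounded.
    assert True
```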
vorticity also shows non-trivial variability.
Yes, but it has been that way since the beginning; maybe @gjoseph92 knows more about it.
I don't think that's too surprising; I think that one spills a bit? I'd expect it to get better after dask/distributed#7213.
Yep. Spill is also why
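For anyone poking at the spill hypothesis locally, the relevant knobs are the worker memory thresholds in dask's config; a sketch that just pins the documented defaults so they're explicit (not a tuning recommendation, and it needs to be set before the workers start):

```python
# Sketch: pin the worker memory thresholds explicitly while investigating
# spill-driven variance. Values shown are the documented defaults.
import dask

dask.config.set({
    "distributed.worker.memory.target": 0.60,     # managed data starts spilling to disk
    "distributed.worker.memory.spill": 0.70,      # spill based on measured process memory
    "distributed.worker.memory.pause": 0.80,      # worker stops starting new tasks
    "distributed.worker.memory.terminate": 0.95,  # nanny restarts the worker
})
```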
This could even be the same thing as
Some tests that were decently stable have become very volatile around mid-September. This makes it very hard to trust the regression detection system we have in place. But there is a pattern (a rough way to quantify the volatility is sketched after this list):
q3 [5GB parquet] - upstream
q5 [5GB parquet] - upstream - Hard to see but the red line became more volatile around the same time
q7 [5GB parquet] - upstream
q8 [5GB parquet] - upstream
test_dataframe_align - volatile and definitely a regression
test_shuffle - became more volatile
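As referenced above, one rough way to quantify "became more volatile" is a rolling coefficient of variation over the per-run durations; the CSV layout here is assumed, not the actual benchmark database schema:

```python
# Sketch: rolling coefficient of variation over benchmark durations.
# Assumes a CSV with "timestamp" and "duration" columns.
import pandas as pd

runs = pd.read_csv("benchmark_history.csv", parse_dates=["timestamp"])
runs = runs.sort_values("timestamp").set_index("timestamp")

window = 10  # runs per window
rolling = runs["duration"].rolling(window)
cv = rolling.std() / rolling.mean()
print(cv.loc["2022-09-01":])  # did the relative spread jump mid-September?
```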
The interesting thing is that this volatility also happens on latest (2022.6.1), which makes me think the issue was introduced by something in the package sync work (PR merged 09/15). A quick variance check around that date is sketched after this list:
q3 [5GB parquet] latest
q7 [5GB parquet] latest
test_dataframe_align
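As mentioned above, a quick sanity check that the spread really changed around the package sync merge could compare run durations before and after a cutoff with Levene's test; column names and the cutoff are assumptions:

```python
# Sketch: did the variance of run durations change around the merge date?
# Assumes the same "timestamp"/"duration" CSV layout as above.
import pandas as pd
from scipy.stats import levene

runs = pd.read_csv("benchmark_history.csv", parse_dates=["timestamp"])
cutoff = pd.Timestamp("2022-09-15")
before = runs.loc[runs["timestamp"] < cutoff, "duration"]
after = runs.loc[runs["timestamp"] >= cutoff, "duration"]

stat, pvalue = levene(before, after)
print(f"Levene statistic={stat:.2f}, p={pvalue:.4f}")  # small p suggests the spread changed
```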
cc: @ian-r-rose @jrbourbeau @shughes-uk