test_quantized_models.py times out within Nightly CI #1857
Comments
Thanks for the report! I think this might be due to @zou3519's changes in pytorch/pytorch#32495, which also caused the nested tensors CI to fail; see pytorch/pytorch#33091.
@fmassa does test_ops.py run C++ extensions?
I don't think pytorch/pytorch#32495 is related because
EDIT: Here's another reason why I think pytorch/pytorch#32495 is unrelated. Consider #1850. The log for the first failing job, if we download it completely, doesn't say it is using ninja to build, which means that the CI ran and failed before the change in pytorch/pytorch#32495 was committed.
EDIT2: Here is the test PR where we build without ninja. This one also fails in the same way as reported in this issue, so ninja is probably unrelated.
Do we have any other leads on why this is happening? Perhaps adding some debugging information to the
After submitting #1880, I now believe the culprit we should be looking at is something that runs after
From these logs:
Is this reproducible locally?
I think it’s highly dependent on performance. If the test takes longer than 10 minutes to run, it’ll time out like this. Were any tests in that file changed recently?
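One way to check whether a single test is approaching the 10-minute limit is to record the wall-clock time of each test method. The following is only a minimal sketch: `TimedTestCase`, `ExampleTests`, and `run_and_rank` are hypothetical names for illustration and are not part of torchvision's test suite.

```python
import time
import unittest


class TimedTestCase(unittest.TestCase):
    """Hypothetical base class that records how long each test method takes."""

    # Maps test id -> elapsed seconds, shared by all subclasses.
    durations = {}

    def setUp(self):
        self._start = time.perf_counter()

    def tearDown(self):
        elapsed = time.perf_counter() - self._start
        TimedTestCase.durations[self.id()] = elapsed


class ExampleTests(TimedTestCase):
    """Placeholder tests; real tests would inherit from TimedTestCase instead."""

    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)  # stands in for an expensive test body


def run_and_rank(test_case_cls):
    """Run every test in the class and return (test id, seconds), slowest first."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_cls)
    unittest.TextTestRunner(verbosity=0).run(suite)
    return sorted(
        TimedTestCase.durations.items(), key=lambda item: item[1], reverse=True
    )


if __name__ == "__main__":
    for test_id, seconds in run_and_rank(ExampleTests):
        print(f"{seconds:8.3f}s  {test_id}")
```

If the suite is run through pytest, the built-in `--durations=N` option reports the N slowest tests without any code changes.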
If the issue is in
From my local testing the test that takes the longest is
Got this result from running on my branch that outputs junit logs:
CU_VERSION=cpu PYTHON_VERSION=3.8 packaging/build_conda.sh
Results are here: results.zip
This would lead me to believe that this test is the culprit. I've submitted #1885 to skip the test for now and get the nightly build pipeline back on track.
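For anyone who wants to repeat this kind of analysis, the slowest tests can be pulled out of a junit XML report with a few lines of Python. This is a sketch under stated assumptions: the `slowest_tests` helper is hypothetical, and the `SAMPLE` report below is made-up placeholder data, not the contents of results.zip. It assumes the common junit layout in which each `testcase` element carries `classname`, `name`, and `time` attributes.

```python
import xml.etree.ElementTree as ET


def slowest_tests(junit_xml, top=5):
    """Return up to `top` (test name, seconds) pairs, slowest first."""
    root = ET.fromstring(junit_xml)
    cases = []
    # iter() also finds testcase elements nested under a <testsuites> wrapper.
    for case in root.iter("testcase"):
        name = f'{case.get("classname", "")}::{case.get("name", "")}'
        cases.append((name, float(case.get("time", "0"))))
    cases.sort(key=lambda item: item[1], reverse=True)
    return cases[:top]


# Placeholder report for illustration only; names and timings are invented.
SAMPLE = """<testsuite name="pytest" tests="3">
  <testcase classname="test_example.Tester" name="test_a" time="1.5"/>
  <testcase classname="test_example.Tester" name="test_b" time="420.0"/>
  <testcase classname="test_example.Tester" name="test_c" time="0.2"/>
</testsuite>"""

if __name__ == "__main__":
    for name, seconds in slowest_tests(SAMPLE):
        print(f"{seconds:8.1f}s  {name}")
```

With a real report, the file contents would be read from disk and passed in place of `SAMPLE`.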
Assigning this to @raghuramank100, who leads the quantization efforts.
This should be fixed by #3196.
Currently the nightly pipelines are failing when running pytest . on the test/test_ops.py tests.
Example CircleCI Logs:
https://app.circleci.com/jobs/github/pytorch/vision/85108
Log Excerpt:
This is what is currently affecting pytorch/pytorch#33103.
Also, this may be related to #1528.
cc @fmassa