Surface failing tests on GHA #7364
Conversation
I'm sure I'll have more specific feedback in the future after using this for a bit, but even in its current form the new summary is extremely useful. Thanks a lot @pmeier for working on it.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7364
Note: Links to docs will display an error until the docs builds have been completed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Compared to CircleCI, GitHub Actions has one major DevX downside: it is really hard to find failing tests in the logs. This happens for two reasons:

1. GitHub Actions seems to be really slow to load logs in general. We are often waiting multiple tens of seconds, with the browser already asking if we want to stop loading the page completely because it appears to have hung.
2. GitHub Actions loads the lines from the beginning, while the failing tests are summarized at the end. Meaning, we spend quite some time loading content that we don't want to look at.

CircleCI does this better:

1. In case the log exceeds 400k characters, the online view is truncated to the *last* 400k characters, which comes down to roughly 3k lines. Paired with much faster loading in general, we can see the traceback of failing tests on CircleCI before GitHub Actions has even loaded the logs.
2. CircleCI has the ability to surface failing tests in general, so one doesn't even have to look at the logs in 99% of the cases.

Since we can't do anything about 1. here, this PR goes after 2. to bring some of the DevX from CircleCI to GitHub Actions.

CircleCI's ability to surface failing tests comes from this builtin functionality: https://circleci.com/docs/collect-test-data/

This works by analyzing a JUnit XML file. This file can be generated by `pytest`, which is what the domain libraries are using to run their test suites. Copying from the [CircleCI documentation](https://circleci.com/docs/collect-test-data/#pytest), this is a minimal example:

```yaml
- run:
    name: run tests
    command: |
      . venv/bin/activate
      mkdir test-results
      pytest --junitxml=test-results/junit.xml

- store_test_results:
    path: test-results
```

This PR introduces a small custom GitHub action that does the same as `store_test_results` from CircleCI: [`pytest-results-action`](https://github.com/pmeier/pytest-results-action). All it does is parse the same JUnit XML file and store the results as part of the [job summary](https://github.blog/2022-05-09-supercharging-github-actions-with-job-summaries/). The usage is the same:

```yaml
- name: Run tests
  run: pytest --junit-xml=test-results.xml

- name: Summarize test results
  if: always()
  uses: pmeier/pytest-results-action@main
  with:
    path: test-results.xml
```

The branch of this PR was used in pytorch/vision#7364 and "stamped" by the `torchvision` maintainers. Offline, other domain library maintainers also expressed their support for this.

Another option, currently explored by @DanilBaibak, is to bring the full PyTorch HUD, including the bot, to the domain libraries. That seems like a larger task that might take some time. In the meantime, this PR could drastically improve DevX without the wait.
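For context, a job summary is ordinary Markdown appended to the `$GITHUB_STEP_SUMMARY` file that GitHub Actions exposes to every step. The sketch below shows only that raw mechanism, with made-up numbers; it is not how `pytest-results-action` actually formats its report.

```yaml
# Minimal sketch of the job-summary mechanism itself. Anything appended
# to $GITHUB_STEP_SUMMARY is rendered as Markdown on the run's summary
# page; pytest-results-action writes its parsed JUnit results there.
- name: Write a job summary by hand
  run: |
    {
      echo "### Test results"
      echo ""
      echo "| outcome | count |"
      echo "| ------- | ----- |"
      echo "| failed  | 1     |"
      echo "| passed  | 42    |"
    } >> "$GITHUB_STEP_SUMMARY"
```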
pytorch/test-infra#3841 was merged and thus we can fully depend on this behavior now. 312f4dc will do a final showcase of the functionality. Afterwards, I'll revert caf522a and thus only real failures will remain.
This reverts commit caf522a.
Before stamping, who is in charge of maintaining `pytest-results-action`?
Right now, it is just me. We didn't have any discussions around that. There are similar instances all over the PyTorch ecosystem. @DanilBaibak is working on surfacing failing tests through the HUD as well. Right now it only shows you the failing job with the log, but this will be improved in the near future. Meaning, maintainers have two options to get this information, and thus, if for some reason I can no longer (or no longer want to) maintain pmeier/pytest-results-action, we can just rip it out. Of course, forking is always an option.
> Right now, it is just me. We didn't have any discussions around that.

I'll stamp to unblock. However, I want to be clear upfront that while this is a much-needed feature, it isn't something the torchvision engineers can maintain.
Hey @pmeier! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
Reviewed By: vmoens
Differential Revision: D44416567
fbshipit-source-id: b8101bee132ebbdc2313eae2030e3f82509d9eb9
Running a large test suite like `torchvision`'s results in really sluggish behavior of GitHub Actions. To counteract this somewhat, we recently merged #7267, although that meant we needed to give up the `-v` flag for the `pytest` invocation. As explained in 2. of #7267 (comment), this is not a big deal on CircleCI due to its ability to surface failing tests natively in a separate tab. GHA does not have this ability built in, but I have written a small action that makes it possible: https://github.com/pmeier/pytest-results-action/
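As an aside, the failure details that end up in the JUnit XML do not depend on console verbosity, so dropping `-v` from the log does not remove information from the summary. A sketch, with placeholder paths:

```yaml
# Sketch with placeholder paths: pytest records the full traceback of
# every failing test in the JUnit XML regardless of console flags such
# as -v or -q, so a quiet log and a detailed summary can coexist.
- name: Run tests quietly but keep detailed failure data
  run: pytest test/ -q --junit-xml=test-results.xml
```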
@NicolasHug, @vfdev-5 I've failed a few tests for demonstration purposes; to view them, check the job summary of the CI run. I've reinstated the `-v` flag for a realistic experience. Could you try it and report back if this is good enough? IMO this is the best UX that we can get out of GHA. If we want more, we need other custom solutions.
@osalpekar @DanilBaibak I've used this patch to upload the results: https://github.com/pytorch/test-infra/compare/3660cdcfa532b664dd3afdd6b25809725c065da2..c2ab5de5211b4eea97dd661a26c1f938645ee2c6. If we want to go that route, we should probably add a `test-results` parameter and only act if that is set.

cc @seemethere
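A rough sketch of what such an opt-in parameter could look like in a reusable workflow; the input name, defaults, and surrounding job are assumptions for illustration, not part of the actual patch:

```yaml
# Hypothetical opt-in parameter for a reusable test workflow. Everything
# here except the action itself is an assumption, not code from this PR.
on:
  workflow_call:
    inputs:
      test-results:
        description: Path to a JUnit XML file; leave empty to skip the summary
        required: false
        type: string
        default: ""

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # ... checkout, setup, and the actual test run go here ...
      - name: Summarize test results
        # only act if the caller opted in by setting the parameter
        if: ${{ always() && inputs.test-results != '' }}
        uses: pmeier/pytest-results-action@main
        with:
          path: ${{ inputs.test-results }}
```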