
Surface failing tests on GHA #7364


Merged: 19 commits merged into pytorch:main on Mar 14, 2023

Conversation

@pmeier (Collaborator) commented Feb 28, 2023

Running a large test suite like torchvision's results in really sluggish behavior of the GitHub Actions UI. To counteract this somewhat, we recently merged #7267, although that meant we needed to give up the -v flag for the pytest invocation. As explained in 2. of #7267 (comment), this is not a big deal on CircleCI due to its ability to surface failing tests natively in a separate tab. GHA does not have this ability built in, but I have written a small action that makes it possible: https://github.com/pmeier/pytest-results-action/

@NicolasHug, @vfdev-5

I've failed a few tests for demonstration purposes. To view them, have a look at the job summaries of the test workflows on this PR.

I've reinstated the -v flag for a realistic experience. Could you try it and report back whether this is good enough? IMO this is the best UX we can get out of GHA; if we want more, we need other custom solutions.
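For reference, here is a minimal sketch of how the two pieces could fit together in a workflow. The file path, step names, and the action's input name are illustrative assumptions, not the exact configuration of this PR:

```yaml
- name: Run tests
  # -v restores per-test output in the log; --junit-xml writes the report
  # that the summary action parses afterwards (file name is an assumption).
  run: pytest -v --junit-xml=test-results.xml test

- name: Surface failing tests
  # always() ensures the summary is rendered even when pytest fails.
  if: always()
  uses: pmeier/pytest-results-action@v0.3.0
  with:
    # Input name assumed for illustration; see the action's README for the
    # exact interface.
    path: test-results.xml
```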

@osalpekar @DanilBaibak

I've used this patch to upload the results: https://github.com/pytorch/test-infra/compare/3660cdcfa532b664dd3afdd6b25809725c065da2..c2ab5de5211b4eea97dd661a26c1f938645ee2c6. If we want to go that route, we should probably add a test-results parameter and only act if that is set.
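For illustration only, such a test-results parameter on a reusable workflow could look roughly like the sketch below; the input name, default, and step are hypothetical and not existing test-infra code:

```yaml
on:
  workflow_call:
    inputs:
      # Hypothetical opt-in parameter, named here only for illustration.
      test-results:
        description: Path to a JUnit XML file to surface in the job summary
        required: false
        type: string
        default: ""

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # ... checkout, setup, and the actual test run go here ...
      - name: Summarize test results
        # Only act if the caller opted in by setting test-results.
        if: ${{ always() && inputs.test-results != '' }}
        uses: pmeier/pytest-results-action@v0.3.0
        with:
          path: ${{ inputs.test-results }}  # input name assumed for illustration
```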

cc @seemethere

@pmeier changed the title from "Surface failing tests on GHA" to "[PoC] Surface failing tests on GHA" on Mar 1, 2023
@pmeier pmeier mentioned this pull request Mar 1, 2023
@NicolasHug (Member)

I'm sure I'll have more specific feedback in the future after using this for a bit, but even in its current form the new summary is extremely useful. Thanks a lot @pmeier for working on it.

@pmeier (Collaborator, Author) commented Mar 8, 2023

In addition to running pytest -v, this also enables us to run our scripts with bash -x, i.e. printing each command before it is executed. See #7396 (comment).
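For context, -x makes bash trace every command to the log as it runs, so you can see exactly where a script is when something fails. A minimal sketch of such a step, with a made-up script path:

```yaml
- name: Run setup and tests
  # bash -x prints each command before executing it, which pairs well with
  # the verbose pytest output (the script path here is an assumption).
  run: bash -x ./.github/scripts/run_tests.sh
```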

@pytorch-bot (bot) commented Mar 13, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7364

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

seemethere pushed a commit to pytorch/test-infra that referenced this pull request Mar 13, 2023
Compared to CircleCI, GitHub Actions has one major DevX downside: it is
really hard to find failing tests in the logs. This happens for two
reasons:

1. GitHub Actions seems to be really slow to load logs in general. We are often waiting multiple tens of seconds, with the browser already asking if we want to stop loading the site completely because it seems to have hung.
2. GitHub Actions loads the lines from the beginning, while the failing tests are summarized at the end. Meaning, we spend quite some time loading content that we don't even want to look at.

CircleCI does this better:

1. In case the log exceeds 400k characters, the online view is truncated to the *last* 400k characters, which comes down to roughly 3k lines. Paired with much faster loading in general, we can see the traceback of failing tests on CircleCI before GitHub Actions has even loaded the logs.
2. CircleCI has the ability to surface failing tests natively, so one doesn't even have to look at the logs in 99% of the cases.

Since we can't do anything about 1. here, this PR goes after 2. to bring
some of the DevX from CircleCI to GitHub Actions.

This ability of CircleCI to surface failing tests comes from its built-in functionality: https://circleci.com/docs/collect-test-data/

This works by analyzing a JUnit XML file. This file can be generated by
`pytest`, which is what the domain libraries are using to run their test
suites. Copying from the [CircleCI
documentation](https://circleci.com/docs/collect-test-data/#pytest),
this is a minimal example:

```yaml
      - run:
          name: run tests
          command: |
            . venv/bin/activate
            mkdir test-results
            pytest --junitxml=test-results/junit.xml

      - store_test_results:
          path: test-results
```

This PR introduces a small custom GitHub action that does the same as the `store_test_results` step from CircleCI: [`pytest-results-action`](https://github.com/pmeier/pytest-results-action). All it does is parse the same JUnit XML file and store the results as part of the [job summary](https://github.blog/2022-05-09-supercharging-github-actions-with-job-summaries/). The usage is similar:

```yaml
- name: Run tests
  run: pytest --junit-xml=test-results.xml

- name: Summarize test results
  if: always()
  uses: pmeier/pytest-summary-gha@v0.2.0
  with:
    junit-xml: test-results.xml
```
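For background, a GitHub Actions job summary is Markdown appended to the file that the $GITHUB_STEP_SUMMARY environment variable points to; the action publishes the parsed results through this mechanism. A minimal sketch of writing a summary by hand, with placeholder content:

```yaml
- name: Write a job summary manually
  if: always()
  run: |
    # Anything appended to this file is rendered as Markdown on the
    # workflow run's summary page.
    echo "## Test results" >> "$GITHUB_STEP_SUMMARY"
    echo "Failing tests, if any, would be listed here." >> "$GITHUB_STEP_SUMMARY"
```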

The branch of this PR was used in pytorch/vision#7364 and "stamped" by
the `torchvision` maintainers. Offline, other domain library maintainers
also expressed their support for this.

Another option, currently explored by @DanilBaibak, is to bring the full PyTorch HUD, including the bot, to the domain libraries. This seems like a larger task that might take some time. In the meantime, this PR could drastically improve DevX without the wait.
@pmeier (Collaborator, Author) commented Mar 13, 2023

pytorch/test-infra#3841 was merged, so we can fully depend on this behavior now. 312f4dc will do a final showcase of the functionality. Afterwards, I'll revert caf522a so that only real failures remain.

@pmeier pmeier marked this pull request as ready for review March 13, 2023 22:31
@pmeier pmeier requested a review from NicolasHug March 13, 2023 22:31
@pmeier pmeier requested a review from vfdev-5 March 13, 2023 22:31
@pmeier changed the title from "[PoC] Surface failing tests on GHA" to "Surface failing tests on GHA" on Mar 14, 2023
@NicolasHug (Member)

Before stamping, who is in charge of maintaining pmeier/pytest-results-action@v0.3.0? Will this be the Nova team, or just you?

@pmeier (Collaborator, Author) commented Mar 14, 2023

Right now, it is just me. We didn't have any discussions around that. There are similar instances all over the test-infra repository, e.g. here.

@DanilBaibak is working on surfacing failing tests through the HUD as well. Right now it only shows you the failing job with the log, but this will be improved in the near future. That means maintainers have two ways to get this information, so if for some reason I can no longer maintain pmeier/pytest-results-action, or no longer want to, we can just rip it out. Of course, forking is always an option.

@NicolasHug (Member) left a comment

> Right now, it is just me. We didn't have any discussions around that.

I'll stamp to unblock. However, I want to be clear upfront that while this is a much needed feature, this isn't something the torchvision engineers can maintain.

@pmeier pmeier merged commit d2eaeb8 into pytorch:main Mar 14, 2023
@pmeier pmeier deleted the gha-failures branch March 14, 2023 14:44
@github-actions (bot)

Hey @pmeier!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit that referenced this pull request Mar 30, 2023
Reviewed By: vmoens

Differential Revision: D44416567

fbshipit-source-id: b8101bee132ebbdc2313eae2030e3f82509d9eb9