Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GH action for running tests against upstream dev #4583

Merged
merged 12 commits into from
Nov 22, 2020

Conversation

andersy005
Copy link
Member

@andersy005 andersy005 commented Nov 13, 2020

This GitHub action will run tests against the upstream dependencies. When the tests fail, the GH action will create an issue (with "CI" label). The body of the created issue includes:

  • test summary report
  • a link to the failed workflow run

Here's a couple screenshots from a toy repository:

Screen Shot 2020-11-13 at 4 12 03 PM

Screen Shot 2020-11-13 at 4 12 13 PM

  • Closes nightly upstream test #4574
  • Tests added
  • Passes isort . && black . && mypy . && flake8
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

@keewis
Copy link
Collaborator

keewis commented Nov 14, 2020

thanks, @andersy005, this looks great.

For reference, this is how we currently install the upstream-dev dependencies:

# TODO: add sparse back in, once Numba works with the development version of
# NumPy again: https://github.com/pydata/xarray/issues/4146
- bash: |
source activate xarray-tests
conda uninstall -y --force \
numpy \
scipy \
pandas \
matplotlib \
dask \
distributed \
zarr \
cftime \
rasterio \
pint \
bottleneck \
sparse
python -m pip install \
-i https://pypi.anaconda.org/scipy-wheels-nightly/simple \
--no-deps \
--pre \
--upgrade \
numpy \
scipy \
pandas
python -m pip install \
-f https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com \
--no-deps \
--pre \
--upgrade \
matplotlib
python -m pip install \
--no-deps \
--upgrade \
git+https://github.com/dask/dask \
git+https://github.com/dask/distributed \
git+https://github.com/zarr-developers/zarr \
git+https://github.com/Unidata/cftime \
git+https://github.com/mapbox/rasterio \
git+https://github.com/hgrecco/pint \
git+https://github.com/pydata/bottleneck

Installing matplotlib doesn't work right now, though (see #4256).

@max-sixty
Copy link
Collaborator

This is awesome! Thanks for kicking it off @andersy005

@andersy005
Copy link
Member Author

Of course, @keewis & @max-sixty!

The latest commits include updates to the issue body and adds a build matrix for Python 3.7+. The issue body will look like:

Screen Shot 2020-11-14 at 10 16 13 PM

For reference, this is how we currently install the upstream-dev dependencies:

Thank you for the pointer, @keewis! I added PyPI nightly wheels for scipy, numpy, pandas... What should do about the matploblib wheels issue?

@keewis
Copy link
Collaborator

keewis commented Nov 15, 2020

What should do about the matploblib wheels issue?

since it's broken right now I would keep the version from conda-forge until the nightly wheels are available again

Comment on lines 92 to 98
github.issues.create({
owner: context.repo.owner,
repo: context.repo.repo,
body: issue_body,
title: title,
labels: ['CI']
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something (especially since I don't know much about github script and action artifacts), but this looks like it will open a new issue on every run while the fix is being worked on. Is there a way to avoid that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the same. One fairly manual resolution to this is to merge something xfail the test, and then a PR to un-xfail the test.

Otherwise it would have to search the existing issues — maybe possible though probably difficult?

Copy link
Member Author

@andersy005 andersy005 Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something (especially since I don't know much about github script and action artifacts), but this looks like it will open a new issue on every run while the fix is being worked on. Is there a way to avoid that?

An issue will be created only when the following condition is met

if: "always()&&(github.event_name == 'schedule')&&(github.repository == 'pydata/xarray')"

i.e., the workflow run must have been triggered via a cron schedule, and this trigger should be active on the pydata/xarray repository (which allows us to avoid opening issues on other people's folks). Currently, this condition has some limitations though. For instance, a new issue will be created every day as long as the upstream-dev CI is not passing.

Otherwise it would have to search the existing issues — maybe possible though probably difficult?

I'm investigating whether there's a straightforward way to keep track of the latest issue created via the GitHub action, and adding comments to an existing issue instead of creating a new issue.

Copy link
Collaborator

@keewis keewis Nov 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, a new issue will be created every day as long as the upstream-dev CI is not passing.

Exactly, that's what I meant. Sorry for being ambiguous.

I'm investigating whether there's a straightforward way to keep track of the latest issue created via the GitHub action, and adding comments to an existing issue instead of creating a new issue.

That would be cool, but I would like to make sure we don't post a comment every day, even if the failures don't change. Luckily, pytest prints the short test summary info section, so we should be able to extract the test ids and do a set comparison with the last comment (or issue, should we decide we need to open a new issue if different tests fail).

Actually, you could upload a file containing the extracted test ids as a artifact and use that to check whether the failed tests changed in the next run. Depending on how we compare, we might even be able to avoid opening a new issue for the remaining failures after merging a partial fix.

Copy link
Collaborator

@alexamici alexamici Nov 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would skip opening an issue automatically in the initial PR. I would prefer people to monitor the workflow and report issue in a non automatic way.

If we see that this build failures get overlooked, then we add the automatic reporting.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we think people are going to proactively check the overnight runs? Great if so, and if there's a realistic possibility, fine to try it.

.github/workflows/upstream-dev-ci.yaml Outdated Show resolved Hide resolved
Comment on lines +88 to +129
// Run GraphQL query against GitHub API to find the most recent open issue used for reporting failures
const query = `query($owner:String!, $name:String!, $creator:String!, $label:String!){
repository(owner: $owner, name: $name) {
issues(first: 1, states: OPEN, filterBy: {createdBy: $creator, labels: [$label]}, orderBy: {field: CREATED_AT, direction: DESC}) {
edges {
node {
body
id
number
}
}
}
}
}`;

const variables = {
owner: context.repo.owner,
name: context.repo.repo,
label: 'CI',
creator: "github-actions[bot]"
}
const result = await github.graphql(query, variables)
const issue_info = result.repository.issues.edges[0].node

// If no issue is open, create a new issue, else update the
// body of the existing issue.
if (typeof issue_info.number === 'undefined') {
github.issues.create({
owner: variables.owner,
repo: variables.name,
body: issue_body,
title: title,
labels: [variables.label]
})
} else {
github.issues.update({
owner: variables.owner,
repo: variables.name,
issue_number: issue_info.number,
body: issue_body
})
}
Copy link
Member Author

@andersy005 andersy005 Nov 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confident this gives us what we want in terms of reporting failures:, i.e.:

  1. Find the most recently created issue (used for reporting failures)
  2. If there's an existing and open issue, update the body of the issue with the most recent summary failures..
  3. Else if there's no candidate, open issue, create a new issue and post the summary information of the failing tests

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive. I think this is the best of all worlds.

@andersy005 andersy005 marked this pull request as ready for review November 21, 2020 07:39
Copy link
Collaborator

@keewis keewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, @andersy005, this looks good to me.

Updating the issue body is a nice idea to avoid getting notifications. I would have implemented a different behavior to make sure different failures are reported as different issues (as described in #4583 (comment)), but I think we can just try out how much of an issue the merging of all upstream-dev failures into a single issue would be in practice.

@max-sixty
Copy link
Collaborator

@andersy005 do you want to add to whatsnew?

Either way we can merge later

@max-sixty max-sixty merged commit 6daad06 into pydata:master Nov 22, 2020
@andersy005
Copy link
Member Author

Thank you to everyone who provided feedback/review....

FYI, the GitHub action failed to run after the merge: https://github.com/pydata/xarray/actions/runs/376810413

The logs show this error message:

...and actions/github-script@v3 are not allowed to be used in pydata/xarray. Actions in this workflow must be: within a repository owned by pydata.

It looks like the admins of the "pydata/xarray" repo have set the policy to be “Allow local actions only”. This option only allows the actions defined in the repositories within the same organization “pydata” to be used. Can someone with admin privileges enable the “Allow all actions” option from the repo settings: https://github.com/pydata/xarray/settings/actions?

@andersy005 andersy005 deleted the upstream-dev-ci branch November 22, 2020 04:28
@jhamman
Copy link
Member

jhamman commented Nov 22, 2020

@andersy005 - done! I manually triggered the CI and it seems to be working now: https://github.com/pydata/xarray/runs/1437420208?check_suite_focus=true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nightly upstream test
6 participants