Enable ppc64le #94

Merged
merged 7 commits into conda-forge:master
Dec 23, 2019

jaimergp
Member

Checklist

  • Used a fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@jaimergp
Member Author

@conda-forge-admin, please rerender

@jaimergp
Member Author

PPC builds but one test fails. ARM is missing some dependencies...

@xylar
Contributor

xylar commented Dec 20, 2019

The ppc mpich test that's failing is trying to run on 16 processors, which are definitely not available on CI:

 mpiexec -n 16 ./tst_parallel3

Maybe we can disable this particular test?

@beckermr
Member

If it is using openmpi, you can force it to use 16 threads anyways. I forget the flag for that.

@xylar
Contributor

xylar commented Dec 20, 2019

If it is using openmpi, you can force it to use 16 threads anyways. I forget the flag for that.

The flag is --oversubscribe, or the environment variable is OMPI_MCA_rmaps_base_oversubscribe=yes, already set here:
https://github.com/conda-forge/libnetcdf-feedstock/blob/master/recipe/build.sh#L7
and redundantly here:
https://github.com/conda-forge/libnetcdf-feedstock/blob/master/recipe/build.sh#L10

But the current problem is with mpich, for which I don't know of an equivalent flag or variable.
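
For reference, the Open MPI workaround described above could be wrapped in a small helper along these lines. This is only a sketch: the function name `run_ranks` and the `MPIEXEC` override are hypothetical and not taken from the feedstock's `build.sh`; only the environment variable and the `mpiexec -n 16 ./tst_parallel3` invocation come from this thread. The variable is an Open MPI setting, so it is not expected to help with mpich.

```shell
#!/bin/sh
# Hypothetical helper: run a program on N MPI ranks while asking Open MPI
# to tolerate more ranks than there are available cores.
#   $1      = number of ranks
#   $2...   = program to run and its arguments
run_ranks() {
    nranks="$1"
    shift
    # Open MPI reads this MCA parameter from the environment.
    # MPIEXEC is an assumed override for the launcher; it defaults to mpiexec.
    OMPI_MCA_rmaps_base_oversubscribe=yes \
        "${MPIEXEC:-mpiexec}" -n "$nranks" "$@"
}
```

With Open MPI installed, `run_ranks 16 ./tst_parallel3` would then launch the failing test with oversubscription allowed.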

@beckermr
Member

Ahhhh. Well, you could patch the test to use fewer cores on ppc64le.

Comment on lines 9 to 14
-echo
-echo "Testing simple parallel I/O with 16 processors..."
-@MPIEXEC@ -n 16 ./tst_parallel3
+# echo
+# echo "Testing simple parallel I/O with 16 processors..."
+# @MPIEXEC@ -n 16 ./tst_parallel3
Contributor

Are we good with disabling this test for all systems? I think so, since a 16-processor test isn't very meaningful on CI, but I just want to make sure.

Seems to have done the trick in mpich for ppc64le

Member

Let’s just set the number of MPI ranks to 4 but still run the test?

Member Author

That patch was only applied for ppc64le. I can also try with -n 4 if you want, but in any case it would be for ppc64le only.
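
The `-n 4` alternative suggested in this review could be sketched roughly as follows, instead of commenting the test out. This is not the patch that was merged. The helper name `limit_ranks` is hypothetical, and the platform string `linux-ppc64le` is an assumption about what the build environment provides. Only lines that mention `tst_parallel3` are touched, so the accompanying `echo` message would still say 16 processors.

```shell
#!/bin/sh
# Hypothetical helper: lower the rank count of the 16-process test in a
# test script, but only when building for ppc64le.
#   $1 = platform string (assumed to look like "linux-ppc64le")
#   $2 = path to the test script to edit
limit_ranks() {
    platform="$1"
    script="$2"
    case "$platform" in
        *ppc64le*)
            # Rewrite "-n 16" to "-n 4" on tst_parallel3 lines only.
            # Write to a temporary file first, then copy the contents back
            # so the original file keeps its permissions.
            sed '/tst_parallel3/s/-n 16 /-n 4 /' "$script" > "$script.tmp" &&
                cat "$script.tmp" > "$script" &&
                rm -f "$script.tmp"
            ;;
        *)
            # Other platforms keep the original rank count.
            ;;
    esac
}
```

A build script could call it as `limit_ranks "$target_platform" path/to/test_script`, where both the variable name and the path are placeholders for whatever the recipe actually uses.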

@xylar
Contributor

xylar commented Dec 22, 2019

@jaimergp, might it be worth separating ppc64le (now working) from aarch64 (missing dependencies) into separate PRs?

@jaimergp
Member Author

@conda-forge-admin, please rerender

@jaimergp
Member Author

Opendap.org was down, resulting in all builds erroring:

 <!DOCTYPE^ HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><head><title>503 Service Temporarily Unavailable</title></head><body><h1>Service Temporarily Unavailable</h1><p>The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.</p><hr><address>Apache/2.2.15 (CentOS) Server at test.opendap.org Port 80</address></body></html>

Is there any way to retrigger the CI?

@jaimergp jaimergp changed the title from "Enable ppc64le/aarch64" to "Enable ppc64le" Dec 22, 2019
@beckermr
Member

@conda-forge-admin, please restart ci

@xylar
Contributor

xylar commented Dec 22, 2019

Ah, sorry, I missed that the patch was only for ppc64le. Fine with me either way.

@xylar
Contributor

xylar commented Dec 22, 2019

opendap.org still seems to be down.
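
One way to check whether the server is answering again before restarting CI is to look at the HTTP status code alone. The sketch below is illustrative only: the helper names `status_ok` and `probe_url` are hypothetical, and probing the root path of `test.opendap.org` is an assumption, since the tests may request other URLs on that host.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the HTTP status code in $1 indicates
# that the server answered normally (2xx or 3xx).
status_ok() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Hypothetical helper: print only the HTTP status code for a URL.
# curl prints 000 when it received no HTTP response at all.
probe_url() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}
```

For example, `status_ok "$(probe_url http://test.opendap.org/)" && echo up || echo down` would report `down` for a 503 response like the one quoted earlier in this thread.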

@jaimergp
Member Author

Seems to be back online!

@conda-forge-admin, please restart ci

@jaimergp
Member Author

Or... maybe not :/

@xylar
Contributor

xylar commented Dec 23, 2019

@conda-forge-admin, please restart ci

@xylar
Contributor

xylar commented Dec 23, 2019

Maybe now?

@xylar
Contributor

xylar commented Dec 23, 2019

It looks like this has happened before:
Unidata/netcdf-c#1449
and there is an issue to stop using the server:
Unidata/netcdf-c#1451
I will indicate our difficulties and interest in having this happen sooner rather than later.

@kmuehlbauer
Contributor

@xylar It might take some time to have this included in a new netcdf release. Should we patch our way around it for the time being?

@xylar
Contributor

xylar commented Dec 23, 2019

I don't think that's a good idea. If there were a patch to make these tests work without test.opendap.org, I'd be all for it. But I don't think such a patch exists yet, so I think we would need to disable all the tests that use test.opendap.org. That strikes me as risky, and I'm not enough of an expert in the tests to know which might be considered redundant.

@xylar
Contributor

xylar commented Dec 23, 2019

I just wrote to support at opendap.org to see if they can tell us when the server is expected to be back up.

@kmuehlbauer
Contributor

You're right, I had the impression there was such a fix already available.

@xylar
Contributor

xylar commented Dec 23, 2019

From OpenDAP:

Should be up now.

Please don't hesitate to let us know if you find it down in the future.

@conda-forge-admin, please restart ci

@kmuehlbauer
Contributor

@xylar Seems that the new archs are not picked up on azure.

@xylar
Contributor

xylar commented Dec 23, 2019

I believe that’s to be expected. Anyone else know?

@xylar
Contributor

xylar commented Dec 23, 2019

If no objections, I’ll merge this later this evening.

@xylar
Contributor

xylar commented Dec 23, 2019

@jaimergp, do you want to add yourself as a maintainer? (Could also be in the next PR.)

@beckermr
Member

The default providers for ppc64le and aarch64 are Travis and Drone, respectively. They should not be on Azure by default. Azure doesn’t have native builds for these, and the emulated ones are super slow.

LGTM on the PR!

@kmuehlbauer
Contributor

@beckermr Thanks for the explanation. Feeling a bit silly now, 😀

@jaimergp
Member Author

I'll add myself as a maintainer in the next PR (arm). I don't have access to a PC now, so let's try to get this merged first!

@ocefpaf ocefpaf merged commit 3fb09b1 into conda-forge:master Dec 23, 2019