Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP · ppc64le support #16

Closed
wants to merge 21 commits into from

Conversation

jaimergp
Copy link
Member

@jaimergp jaimergp commented Dec 3, 2019

Checklist

  • Used a fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged) -- building new architectures only
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

This will resolve #8

Blocking issues

  • ocl-icd-system is not yet ready for ppc64le and aarch64. PR here.
  • doxygen 1.8.14 is not available for ppc64le and aarch64. There is 1.8.16, but we need 1.8.14 for OpenMM. Submitted a PR here.
  • lxml cannot be found in aarch64. ppc64le is indeed available for ppc64le, but only through defaults. Submitted PR to conda-forge here.

Once dependencies are solved, keep an eye on issues like this one:

import: 'simtk'
import: 'simtk.openmm'
+ python -m simtk.testInstallation
/home/conda/feedstock_root/build_artifacts/openmm-cpu_1566489851175/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.6/site-packages/numpy/core/getlimits.py:53: RuntimeWarning: divide by zero encountered in log10
  self.precision = int(-log10(self.eps))
/home/conda/feedstock_root/build_artifacts/openmm-cpu_1566489851175/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.6/site-packages/numpy/core/getlimits.py:53: RuntimeWarning: overflow encountered in log10
  self.precision = int(-log10(self.eps))

OpenMM Version: 7.4
Git Revision: Unknown

There are 2 Platforms available:

1 Reference - Successfully computed forces
2 CPU - Successfully computed forces

Median difference in forces between platforms:

Problem with OpenMM installation encountered. OpenMM will not work until the problem has been fixed.


Error message: 0.0 cannot be raised to a negative power

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@jaimergp
Copy link
Member Author

jaimergp commented Dec 3, 2019

@conda-forge-admin, please rerender

@jaimergp
Copy link
Member Author

jaimergp commented Dec 6, 2019

All needed dependencies are now available! However, builds for PowerPC and ARM are failing for different reasons (although apparently related to the same thing?).

ppc64le fails here (this error appears several times). We might be able to fix it by adding the suggested flag (-DNO_WARN_X86_INTRINSICS).

/home/conda/feedstock_root/build_artifacts/openmm_1575605736402/work/platforms/opencl/src/OpenCLArray.cpp:27:
/home/conda/feedstock_root/build_artifacts/openmm_1575605736402/_build_env/lib/gcc/powerpc64le-conda_cos7-linux-gnu/8.2.0/include/emmintrin.h:56:2: error: #error "Please read comment above.  Use -DNO_WARN_X86_INTRINSICS to disable this error."
 #error "Please read comment above.  Use -DNO_WARN_X86_INTRINSICS to disable this error."
  ^~~~~

aarch64 fails here (line 2599; this error appears several times too):

/drone/src/build_artifacts/openmm_1575605706819/work/platforms/cpu/src/CpuPlatform.cpp:34:
--
2599 | /drone/src/build_artifacts/openmm_1575605706819/work/openmmapi/include/openmm/internal/vectorize_sse.h:35:10: fatal error: smmintrin.h: No such file or directory
2600 | #include <smmintrin.h>
2601 | ^~~~~~~~~~~~~
2602 | compilation terminated.

Is there anything special about intrinsics in PowerPC/ARM when it comes to OpenMM, @peastman? Why does the compiler does not include smmintrin.h and emmintrin.h in ARM, @isuruf?

--
[edit]
Nevermind, Isuru; I just realized that these intrinsics are architecture specific.

@jaimergp
Copy link
Member Author

jaimergp commented Dec 6, 2019

PowerPC

I got ppc64le to build by adding the DNO_WARN_X86_INTRINSICS flag! I am not seeing the precision issues reported in previous attempts, but the Reference vs CPU diffs are too large:

$ python -m simtk.testInstallation
OpenMM Version: 7.4.1
Git Revision: 7ad4af319497da5ce4b3040e8b03d5860d4082a2
There are 2 Platforms available:
1 Reference - Successfully computed forces
2 CPU - Successfully computed forces
Median difference in forces between platforms:
Reference vs. CPU: 0.194358  *** LARGE DIFFERENCE **

Is there any other workaround for this issue?

Notice that OpenCL is built too, but we cannot test it because pocl is not available for these architectures.

ARM

These don't work yet.

For some reason, the compiler tries to use the x86 intrinsics, although I can see mentions to Neon instructions in the CMake configurations in OpenMM. Is there any extra flag I need to define to disable these, @peastman?

@peastman
Copy link
Contributor

peastman commented Dec 6, 2019

We have different versions of the vectorization code for x86, PPC, and ARM. It's supposed to automatically pick the correct one based on the architecture you're compiling for, but it looks like something is going wrong with that. Here's the section of the CMake script that does it:

https://github.com/openmm/openmm/blob/2ca749a164af6023653b3044b21056a7dd2b33a6/CMakeLists.txt#L36-L49

We then include the correct header based on which of those symbols was defined:

https://github.com/openmm/openmm/blob/master/openmmapi/include/openmm/internal/vectorize.h

@jaimergp
Copy link
Member Author

jaimergp commented Dec 7, 2019

Thanks for the tip! I defined some variables to force the architecture detection, and that reduced the number of errors. However there are still some:

  • PPC has the same errors, but it only comes from the OpenCL platform.
  • ARM seems to be related to the C wrapper? It's only one error in this case.

@peastman
Copy link
Contributor

peastman commented Dec 7, 2019

I see the problem on ARM. It's this line: https://github.com/openmm/openmm/blob/master/libraries/vecmath/src/vecmath.cpp#L1. It ought to be #if defined(__ARM__), not #if defined(__ANDROID__).

I'm a little confused about the PPC error. cl.hpp contains these lines:

#if defined(linux) || defined(__APPLE__) || defined(__MACOSX)
#include <alloca.h>

#include <emmintrin.h>
#include <xmmintrin.h>
#endif // linux

That seems to assume Linux will always be x86. But we've successfully compiled on Linux PPC before. So what is different this time, and why did those lines not cause problems before?

@jaimergp
Copy link
Member Author

jaimergp commented Dec 8, 2019

Re PPC, maybe we were using a different compiler (older GCC, clang, etc) where the intrinsics can be used without the error? It's actually a warning turned error because we are not defining NO_WARN_X86_INTRINSICS.

@jaimergp
Copy link
Member Author

jaimergp commented Dec 8, 2019

I re-added NO_WARN_X86_INTRINSICS to fix the OpenCL errors in PPC, but the large differences in Reference vs CPU are still there. I thought it would be caused by the intrinsics thing, but apparently it's not.

Regarding ARM, patching ANDROID to ARM in that line allowed the compilation to continue, uncovering the same OpenCL error later in the build. In this arch, gcc does not provide the intrinsics headers, so we can't use NO_WARN_X86_INTRINSICS. I guess there is not easy patch to apply in this case, right?

There's also this other error:

/drone/src/build_artifacts/openmm_1575800787904/work/openmmapi/include/openmm/internal/vectorize_neon.h:35:10: fatal error: cpu-features.h: No such file or directory

Is that Android specific?

@peastman
Copy link
Contributor

peastman commented Dec 8, 2019

For the PPC intrinsics error, possibly we just need to upgrade to the latest version of cl.hpp. I see it no longer imports the Intel intrinsics. https://github.com/KhronosGroup/OpenCL-CLHPP/blob/master/include/CL/cl.hpp

Is that Android specific?

Yes, you're right. It's being used in this code:

static bool isVec4Supported() {
    uint64_t features = android_getCpuFeatures();
    return (features & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
}

As you can tell, Android is the only platform where we've ever compiled for ARM before. We need to figure out how to replace that with something that isn't Android specific.

@jaimergp
Copy link
Member Author

@conda-forge-admin, please rerender

@jaimergp jaimergp changed the title WIP · ppc64le & aarch64 support WIP · ppc64le support Jan 14, 2020
@jaimergp
Copy link
Member Author

Disabling ARM for now. I'll open a PR for that one once the errors mentioned above have been fixed.

@jaimergp
Copy link
Member Author

jaimergp commented Jan 14, 2020

The cl.hpp update did remove the warnings about the intrinsics, but we are are still observing large differences in forces.

@jaimergp
Copy link
Member Author

jaimergp commented Jun 2, 2020

Closing this one. Work will continue on #23

@jaimergp jaimergp closed this Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PowerPC builds
4 participants