Some tests fail on i586, ppc64, ppc64le and s390x #2258
Comments
These errors are actually on ppc64le, and I have no idea what is happening. The errors on ppc64 and s390x (I have no idea who would run Espresso on an IBM mainframe, but sure, we can have binary packages for it...) are more obvious though: there the periodicity check does not work, probably because these are big-endian architectures and we are doing some incorrect bitwise operations. |
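As a hedged illustration of how a bitwise check can work on little-endian machines and silently fail on big-endian ones, here is a minimal Python sketch. It is not the actual ESPResSo periodicity code; the function names and the three-bit flag layout are assumptions made only for this example. The sketch contrasts reading the first byte of an integer's memory representation (layout-dependent) with shifting and masking the value itself (portable).

```python
import struct


def flags_from_first_byte(flags: int, byteorder: str) -> int:
    """Pack `flags` as a 32-bit int in the given byte order and read only
    the first byte, mimicking code that reinterprets raw memory.
    (Hypothetical bug pattern, not taken from the ESPResSo sources.)"""
    prefix = "<" if byteorder == "little" else ">"
    raw = struct.pack(prefix + "i", flags)
    return raw[0]


def is_periodic(flags: int, axis: int) -> bool:
    """Portable variant: shift and mask the value, independent of how
    the integer is laid out in memory."""
    return bool((flags >> axis) & 1)


# Assumed layout: bit 0 = x, bit 1 = y, bit 2 = z.
flags = 0b101  # periodic in x and z

first_byte_little = flags_from_first_byte(flags, "little")
first_byte_big = flags_from_first_byte(flags, "big")

print("first byte, little-endian layout:", first_byte_little)
print("first byte, big-endian layout:   ", first_byte_big)
print("portable check:", [is_periodic(flags, axis) for axis in range(3)])
```

With the little-endian layout the first byte carries the flags, while with the big-endian layout it is the most significant byte and therefore zero for small values, so a check written this way would report "not periodic" on ppc64 and s390x.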
Not sure, I fixed it now... |
What did you change? Also, please re-open this issue as the periodicity issue on the big-endian architectures still exists. |
I think it is a parallel test issue,
|
On ppc64:
and
|
Same on s390x |
There is still an issue on Tumbleweed i586:
|
Ok, #2259 fixed big endian! |
SLE_12_SP4: needs to be switched from openmpi2 to openmpi, @junghans. armv7l:
i586:
|
SLE_12_SP4 has another problem:
|
@mkuron any idea about that i586 issue? |
Patch in #2265. That only leaves us with the segfault on 32-bit ARM. |
I guess, you need to run qemu again ;-) |
Hmm, on i586 the serial tests take a very long time. |
I think the i586 build machine just got stuck. Before you uploaded my patches, the tests ran just fine, and the patches shouldn't slow down any tests. They also still run fine in my Docker container. Regarding ARM: I couldn't reproduce the analyze_energy failure. Trying to reproduce it was a nightmare though: the QEMU emulation is missing a syscall that OpenMPI 2 uses, so I first had to work around that (openpmix/openpmix#836, open-mpi/ompi#5716). OpenSUSE doesn't provide an arm32v7 Docker image anymore, so I used Ubuntu instead, where the issue does not occur, even when using the same GCC version and the same compiler flags. Then I created my own OpenSUSE arm32v7 Docker image, where I couldn't reproduce the issue either. I even tried So unless you can get us direct access to these build machines, we won't be able to fix whatever issue this is. It doesn't occur in "regular" builds on the respective architecture. |
Maybe @kkaempf knows how to do that! |
Reach out to opensuse-buildservice@opensuse.org, they should be able to help. |
The ppc64le issue is back:
and
|
Hmm, can we track which build hosts are affected? That might be a hardware/architecture/CPU-type problem!? |
Looking at https://build.opensuse.org/packages/python-espressomd/job_history/home:cjunghans:branches:devel:languages:python/openSUSE_Factory_PowerPC/ppc64le, there is no pattern. The job can succeed one day and fail on the same host the next day. These issues are also not reproducible in QEMU emulation, which is too bad as we only have x86_64 hardware on site. This one is also interesting:
|
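One way to check for a per-host pattern is to tally job outcomes by build host. The following is only a sketch: the record format and the host names are made-up placeholders, not data from the Open Build Service job history linked above, and the helper names are hypothetical.

```python
from collections import Counter, defaultdict


def summarize(records):
    """Group (host, status) pairs into a per-host Counter of statuses."""
    per_host = defaultdict(Counter)
    for host, status in records:
        per_host[host][status] += 1
    return per_host


def hosts_with_mixed_results(per_host):
    """Return hosts that both succeeded and failed at least once.
    Such hosts argue against a purely host-specific cause."""
    return sorted(
        host
        for host, counts in per_host.items()
        if counts["succeeded"] > 0 and counts["failed"] > 0
    )


# Placeholder records for illustration only (invented host names).
records = [
    ("hostA", "succeeded"),
    ("hostA", "failed"),
    ("hostB", "failed"),
    ("hostC", "succeeded"),
]

per_host = summarize(records)
mixed = hosts_with_mixed_results(per_host)
print("hosts with both outcomes:", mixed)
```

If most hosts show up in the mixed list, as described above for ppc64le, the failures are unlikely to be tied to one particular machine.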
Are you saying the failures (like the array mismatch) are ppc64le-specific?
Naa, that's just an "out of memory". This can be avoided by adding a |
Yes, this specific set of errors only appears on ppc64le. Some tests have rather tight tolerances that might not be valid on different hardware implementations of floating-point arithmetic, but the deviations appearing here are a bit big to blame on that. |
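To make the distinction between "tolerance too tight" and "result actually wrong" concrete, here is a small sketch using only the Python standard library. The arrays and tolerance values are invented for illustration and are not taken from the failing ESPResSo tests; the helper names are hypothetical.

```python
import math


def all_close(expected, actual, rel_tol=1e-9, abs_tol=0.0):
    """True if every pair of values agrees within the given tolerances."""
    return all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(expected, actual)
    )


def max_deviation(expected, actual):
    """Largest absolute difference between corresponding entries."""
    return max(abs(e - a) for e, a in zip(expected, actual))


expected = [1.0, 2.0, 3.0]
rounding_noise = [1.0 + 1e-12, 2.0, 3.0 - 1e-12]  # plausible FP differences
real_mismatch = [1.0, 2.1, 3.0]                   # far beyond rounding error

print(all_close(expected, rounding_noise), max_deviation(expected, rounding_noise))
print(all_close(expected, real_mismatch), max_deviation(expected, real_mismatch))
```

Reporting the maximum deviation next to the pass/fail result helps judge whether loosening a tolerance is justified or whether, as suspected here, the deviation is too large to be explained by floating-point differences between platforms.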
Can you give me a minimal test case? I have a (large) |
While I can't provide a minimal test case as I don't know what is specifically causing the issue, here is how I built and ran the failing tests manually:
|
@kkaempf on i586 it reproducibly gets stuck at:
|
What's the state of this? |
It's still broken and I can't debug it due to lack of hardware. It does not happen in QEMU-emulated Docker images of the respective architecture. I was unable to set up the official openSUSE QEMU environment, so I can't tell whether it occurs there. So whatever bug this is, it won't get fixed. I do want to set up weekly CI jobs for other architectures, but before I can do that someone needs to merge espressomd/docker#41. |
npt seems to be fixed on aarch64 and i586, but persists on ppc64* (#2468) |
npt:
and test_elc_vs_mmm2d:
and test_elc_vs_mmm2d:
Details here
CC @mkuron