{bio}[foss/2023b] GROMACS v2024.3 #21430
@boegelbot please test @ jsc-zen3
@bedroge: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2355567729 processed Message to humans: this is just bookkeeping information for me,
Test report by @bedroge edit: oops, forgot to include the fix from easybuilders/easybuild-easyblocks#3283, ran into that before...
Test report by @boegelbot
@boegelbot please test @ generoso
@bedroge: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2355672238 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot Test failure in GmxapiMpiTests:
Let's try again...
@boegelbot please test @ generoso
@bedroge: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2355854315 processed Message to humans: this is just bookkeeping information for me,
Test report by @bedroge
Test report by @boegelbot
Also tested this with the EESSI bot for a bunch of CPUs: EESSI/software-layer#709. There it also failed on haswell with the same input/output error, so I've started another build.
Test report by @boegel
GROMACS dev here. I see that the following test case
fails, either timing out or somehow suspended or crashed. C-rescale is a relatively new implementation, and this test case is intended to exercise dark corners of the code, so a real problem is possible. Yet I see the preceding test case (at https://gist.github.com/boegel/75ff6503735f73f2d9ec570366bd181f#file-gromacs-2024-3-foss-2023b_partial-log-L374) took 25 seconds. On my x86 laptop with a |
@mabraham It's probably not the GROMACS configuration itself, but the environment it's running in. It's running in an interactive Slurm job, with 9 cores available (in a cgroup) out of 36 in total on that system. In addition, I've seen this before, but I never got to the bottom of it for GROMACS... If any of this rings a bell, any insights you may have are welcome.
The test cases are only using two pthreads, so if the system is working as you describe, there's no ready explanation of a problem. But if the core-to-cgroup mapping is not working right, such slowdowns are plausible. Do you have / can you get data to observe core occupancy across a loaded node?
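As a first check on the core-to-cgroup mapping, something like the sketch below could be run inside the Slurm job. It only reports which cores the process is allowed to run on, not how busy they are, and it assumes the cgroup confines the job through a CPU set that shows up in the process affinity mask. The helper name `visible_cores` is made up for illustration.

```python
import os


def visible_cores():
    """Return (sorted list of cores this process may run on, logical core count)."""
    total = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        # Linux only: the affinity mask of the current process (pid 0 = self).
        allowed = sorted(os.sched_getaffinity(0))
    else:
        # Assumption for other platforms: no restriction can be queried,
        # so report every logical core.
        allowed = list(range(total))
    return allowed, total


if __name__ == "__main__":
    allowed, total = visible_cores()
    print(f"{len(allowed)} of {total} cores usable: {allowed}")
```

In a 9-core job on a 36-core node, one would expect 9 core IDs to be listed. If all 36 appear, the job is probably not confined the way it is assumed to be.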
Test report by @boegel
The system I was testing on has been migrated from RHEL 8.8 to RHEL 9.4 since the last time I tested (17 Sept'24), and the last attempt didn't fail (see test report above) One difference there is that this test was done on a full workernode (all 36 cores assigned to the Slurm job), so there's no cgroup effect here. I also did an I'm now retesting in a 9-core Slurm job, where cgroup is set up such that available cores are spread across the node:
I didn't see any failing tests after an If I keep
Seems to be the same That's not a total surprise though, we've seen other situations where setting So long story short: friends don't let friends set @mabraham Not sure if it makes sense to integrate the "unset |
It certainly makes sense for you to integrate that unset call in your runner. By default, GROMACS tries to respect existing thread-affinity settings, but if it detects none, it sets them itself. The main simulation engine has a command-line flag to specify behavior here, but these test binaries just use the default. However, the default only checks GOMP_CPU_AFFINITY, and no OMP_* variables, which looks like an omission.
I made https://gitlab.com/gromacs/gromacs/-/issues/5170 to follow up |
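For a runner that wants to apply the "unset before running the tests" advice, a minimal sketch might look like the following. The variable to unset is not preserved in this thread, so the names below are an assumption: `OMP_PROC_BIND` and `OMP_PLACES` are standard OpenMP affinity variables picked for illustration. The helper names `scrubbed_env` and `run_without_affinity` are hypothetical and not part of EasyBuild or GROMACS.

```python
import os
import subprocess
import sys

# Assumption: which OMP_* variables to clear. The thread does not name them here.
AFFINITY_VARS = ("OMP_PROC_BIND", "OMP_PLACES")


def scrubbed_env(env=None, names=AFFINITY_VARS):
    """Return a copy of the environment without the given affinity variables."""
    base = dict(os.environ if env is None else env)
    return {key: value for key, value in base.items() if key not in names}


def run_without_affinity(cmd):
    """Run cmd (a list of arguments) with affinity variables removed; return its exit code."""
    return subprocess.run(cmd, env=scrubbed_env(), check=False).returncode


if __name__ == "__main__":
    # Usage: python this_script.py <test command and its arguments>
    sys.exit(run_without_affinity(sys.argv[1:]))
```

Passing a scrubbed copy through `env=` leaves the runner's own environment untouched, so only the test command sees the change.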
lgtm
Going in, thanks @bedroge!
(created using `eb --new-pr`)
Compared to previous easyconfigs, this now installs the PyPI version of gmxapi. The versioning of the included gmxapi seems a bit confusing: https://gitlab.com/gromacs/gromacs/-/blob/v2024.3/python_packaging/gmxapi/pyproject.toml?ref_type=tags says 0.4.1, https://gitlab.com/gromacs/gromacs/-/blob/v2024.3/python_packaging/gmxapi/src/gmxapi/version.py?ref_type=tags shows 0.5.0a1, and the docs just recommend using the PyPI version (where the latest version is 0.4.2).