fix LAMMPS 3Mar2020 easyconfigs using intel toolchain on AMD CPUs by patching out hardcoded -xHost #11577

Merged: 5 commits merged into easybuilders:develop on Sep 3, 2021


hajgato
Collaborator

@hajgato hajgato commented Oct 26, 2020

No description provided.

@boegel boegel added the bug fix label Oct 26, 2020
@boegel boegel added this to the 4.3.1 (next release) milestone Oct 26, 2020
@boegel boegel changed the title from "fix LAMMPS-3Mar2020-intel installations on AMD CPUs" to "fix LAMMPS 3Mar2020 easyconfigs using intel toolchain on AMD CPUs by patching out hardcoded -xHost" Oct 26, 2020
@boegel
Member

boegel commented Oct 26, 2020

Test report by @boegel
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in total)
node3570.doduo.os - Linux RHEL 8.2, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/9fe4c2e977155154f936e40ea78681ae for a full test report.

@hajgato
Collaborator Author

hajgato commented Oct 27, 2020

@boegel

BlockingIOError: [Errno 11] Resource temporarily unavailable: '/kyukon/home/gent/400/vsc40003/.local/share/virtualenv/seed-app-data/v1.0.1/3.8/image/CopyPipInstall/pip-20.0.2-py2.py3-none-any/pip/_vendor/distlib/w32.exe' -> '/tmp/vsc40003/easybuild/LAMMPS/3Mar2020/intel-2020a-Python-3.8.2-kokkos/easybuild_obj/docenv/lib/python3.8/site-packages/pip/_vendor/distlib/w32.exe'

@hajgato
Collaborator Author

hajgato commented Oct 27, 2020

@boegel plus the patch does not work.

@hajgato
Collaborator Author

hajgato commented Oct 27, 2020

@boegel I have no idea how I could fix this BlockingIOError: [Errno 11] problem on RHEL8

@boegel
Member

boegel commented Oct 27, 2020

@hajgato My home dir was close to being full; that may be the problem... I'll try again.

@boegel
Member

boegel commented Oct 27, 2020

@boegelbot please test @ generoso

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=11577 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_11577 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 8204

Test results coming soon (I hope)...

- notification for comment with ID 717460098 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
generoso-x-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/6dd8083c70dd5eb09aaa1e011c2f884f for a full test report.

@hajgato
Collaborator Author

hajgato commented Oct 27, 2020

Adding configopts = "-DBUILD_DOC=off " solves the problem. I have the feeling that this problem is specific to our system.
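
For readers less familiar with easyconfigs, the option mentioned above could be set roughly as in the fragment below. This is only a minimal sketch, not the file from this PR: the `configopts` value is the one quoted in the comment, while `name`, `version` and `toolchain` are inferred from the build path in the error message earlier in this thread, and everything else a real easyconfig needs is left out. The link between the documentation build and the error is also an inference, based on the `docenv` directory in that path.

```python
# Hypothetical easyconfig fragment (easyconfigs use Python syntax).
# name/version/toolchain are inferred from the build path in the error
# message above; they are not copied from the files in this PR.
name = 'LAMMPS'
version = '3Mar2020'
toolchain = {'name': 'intel', 'version': '2020a'}

# Ask CMake not to build the LAMMPS documentation. The failing copy happened
# inside a 'docenv' virtualenv, so skipping the docs appears to avoid it.
# The trailing space keeps the string safe to extend with further options.
configopts = "-DBUILD_DOC=off "
```

Since the error seems tied to one site's filesystem, an option like this may be better kept as a local customisation than applied everywhere.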

@boegel
Member

boegel commented Oct 27, 2020

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node3407.kirlia.os - Linux centos linux 7.8.2003, x86_64, Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz (cascadelake), Python 2.7.5
See https://gist.github.com/b0134cb5b42f8b3d8921866120d781a6 for a full test report.

@ocaisa
Member

ocaisa commented Oct 28, 2020

Looks like there is no need for the patch, you can use the CMake configopt -DCMAKE_TUNE_DEFAULT="" which should have the same effect (not tested!)

My bad, didn't look at the patch hard enough

@boegel
Member

boegel commented Nov 11, 2020

@hajgato That BlockingIOError is a GPFS bug, for which a workaround is available by patching Python, right? Do we have that documented somewhere?

@hajgato
Collaborator Author

hajgato commented Nov 12, 2020

@boegel The workaround is documented in #11581

@boegel
Member

boegel commented Sep 3, 2021

@boegelbot please test @ generoso
CORE_CNT=16

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=11577 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_11577 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 18280

Test results coming soon (I hope)...

- notification for comment with ID 912537738 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
generoso-x-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/1b52861e29c1e1a6d4bc74b1bef72c92 for a full test report.

@boegel
Member

boegel commented Sep 3, 2021

Test report by @boegel
SUCCESS
Build succeeded for 9 out of 9 (4 easyconfigs in total)
node2605.swalot.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz (haswell), Python 3.6.8
See https://gist.github.com/f45cfd25455bb97d7203e7f71fb291ab for a full test report.

@boegel
Member

boegel commented Sep 3, 2021

Test report by @boegel
SUCCESS
Build succeeded for 8 out of 8 (4 easyconfigs in total)
node3528.doduo.os - Linux RHEL 8.2, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/8c99b8e8e837037d70eb6d2d5ce436d4 for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Sep 3, 2021

Going in, thanks @hajgato!

@boegel boegel merged commit d742c31 into easybuilders:develop Sep 3, 2021
4 participants