LAMMPS return nan for potential energy, tot energy, pressure, etc #43
Hi @mhsiron |
Hi @YuuuXie thanks for your reply. It does appear to be related to Kokkos -- running without newton/kokkos appears to output computed energies. What might make the first potential work with kokkos that does not work with the second potential? Thanks for your help! |
Hi Martin, This is puzzling. So with the same executable, it works with one potential file, but not the other? What command do you use to run LAMMPS? |
Exactly! I am using: Happy to provide the potentials as well (attached). The one that gives me the problem is the one where I used 7A as the cutoff -- my only guess is that it has to do with memory, since the 7A potential must be more expensive to process. But that's just my guess... |
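As a rough illustration of why the larger cutoff is more demanding: at a roughly uniform density, the number of neighbors inside the cutoff sphere grows with the cube of the cutoff. The sketch below is only a back-of-the-envelope estimate; the density value is a hypothetical placeholder, not taken from the structure in this issue, and it cancels out of the ratio anyway.

```python
import math


def neighbors_within_cutoff(cutoff, number_density):
    """Estimate the neighbor count inside a sphere of radius `cutoff` (in Å),
    assuming a uniform number density (atoms per Å^3)."""
    return 4.0 / 3.0 * math.pi * cutoff**3 * number_density


# Hypothetical density, used only for illustration.
density = 0.05

n_4A = neighbors_within_cutoff(4.0, density)
n_7A = neighbors_within_cutoff(7.0, density)

# The ratio depends only on the two cutoffs: (7/4)^3.
ratio = n_7A / n_4A
print(f"7A / 4A neighbor ratio: {ratio:.2f}")
```

Under this assumption each atom in the 7A case has a bit more than five times as many neighbors as in the 4A case, so per-atom work and neighbor-list storage grow accordingly.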
That command looks good. Do you happen to have your input structure as well? Does it have any weird features like isolated atoms? Note that we are in the process of merging this FLARE++ code into the main FLARE repository, it is currently on the development branch: https://github.com/mir-group/flare/tree/development There's a chance there are some bugfixes in there that were done after migrating from flare_pp, so you could try to use the files in https://github.com/mir-group/flare/tree/development/lammps_plugins |
Here's the input structure. It does not have any isolated atoms. I will work on building with the development branch to see if that removes this error! |
I am unable to compile LAMMPS with the development branch of Flare:
|
I see now that your output contains
which indicates that you've set the environment variable As for the build errors on the development branch, you may have to first uninstall the old patch from this directory, |
Hi @anjohan even with the MAXMEM=10, I get the following output:
It seems Flare ignores MAXMEM<12? As for compiling, this was on a fresh LAMMPS download, but of 29Sep2021 Update 3. I just tried it on 17Feb2022 and it appears to fail with the same error:
|
Here is the full error output in VERBOSE=1 mode:
|
On LAMMPS stable_23Jun2022, it appears to not have this issue, so far. Will update if compile finishes. Perhaps a new version of LAMMPS has the following:
|
Confirmed that LAMMPS stable_23Jun2022 compiles with the development branch of the Flare LAMMPS plugin with Kokkos. With the same CMake flags, it did not compile with 17Feb2022 or 29Sep2021 Update 3. As for my original problem with the "7A" Flare++ potential, the Flare development branch did not fix it:
|
@mhsiron Have you tried running it with Kokkos in either Serial or OpenMP mode? Also: Your 7 Å potential link has expired, so I'm not able to download it. |
Hi @mhsiron , Sorry that I didn't get around to following up on this.
|
Hi @anjohan , No worries -- completely understand! Let's try round 3 with the potential link! Here it is (looks like this service keeps files for 30 days): link. I will attempt the recompile -- I am working with my cluster to upgrade our CUDA to be on par with later PyTorch versions first, to see if this fixes some issues. Will update soon. |
Ok some good news -- after recompiling with Kokkos OpenMP mode, I am able to run the 7A potential with Kokkos with decent performance (though not as good as the 4A one with 2x the amount of atoms, and Kokkos+GPU). So it does appear to be related more specifically to Kokkos with GPU. Though I'm still not sure why the 4A worked and the 7A didn't. |
I used the following compile cmd: This was with GCC 12.1, CMake 3.25 |
I am still waiting on the cluster to upgrade CUDA to either 11.3 or 11.6 to see if this resolves the problem for Kokkos with CUDA support. |
After upgrading to CUDA 11.6 and recompiling, I still get the same error with Kokkos+GPU. Only Kokkos+OMP seems to work for the 7A potential. The 4A potential has no issue with either build. |
Hi @mhsiron , Thank you for investigating further! I finally got around to running this on my own setup, and it does indeed produce |
Hm, it appears that when the cutoff (and thus the number of neighbors) grows too large, CUDA simply doesn't launch the radial and spherical harmonics basis calculation, giving no warning whatsoever. For now, you can get it to run by replacing the second |
I have a Flare++ generated LAMMPS potential with the following header:
I have previously compiled LAMMPS with Flare++ and successfully ran MD calculations with a potential that has the following header:
I cannot get LAMMPS to run properly with the new potential -- it does not seem to be able to calculate the energies:
I have a system with a V100 and 100GB of memory.
My input file is as follows:
I am unsure how to troubleshoot.
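One low-effort first step is to find the first thermo line where a quantity turns into NaN, which shows whether the energies are already bad at step 0 or only blow up later. The following is a minimal sketch, assuming the default one-line thermo output in which a header line beginning with `Step` precedes the numeric rows and a `Loop time` line ends the block. The function name and the sample text are made up for illustration and are not taken from this issue's log.

```python
import math


def first_nan_step(log_text):
    """Return (step, [column names that are NaN]) for the first thermo row
    containing NaN, or None if no such row is found."""
    columns = None
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Step":
            # Header row of a thermo block: remember the column names.
            columns = fields
            continue
        if columns is None:
            continue
        if fields[:2] == ["Loop", "time"]:
            # End of the thermo block.
            columns = None
            continue
        if len(fields) != len(columns):
            continue
        try:
            values = [float(f) for f in fields]
            step = int(fields[0])
        except ValueError:
            # Not a numeric thermo row (e.g. a warning printed mid-run).
            continue
        bad = [name for name, v in zip(columns, values) if math.isnan(v)]
        if bad:
            return step, bad
    return None


# Made-up sample text, for illustration only.
sample_log = """Step Temp PotEng TotEng Press
0 300 nan nan nan
Loop time of 1.0 on 1 procs for 0 steps with 2 atoms
"""

result = first_nan_step(sample_log)
print(result)
```

If NaN appears already at step 0, the problem lies in the force/energy evaluation itself rather than in the dynamics, which is consistent with comparing the same input with and without the Kokkos GPU build.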