[Feat]: MD Acceleration and LAMMPS Integration #66

Open
1 of 4 tasks
HNUSTLingjunWu opened this issue Dec 20, 2024 · 5 comments


@HNUSTLingjunWu

Contact Details

lingjun.wu@hotmail.com

Feature Description

Dear Developers:

I've tried running MD with MatterSim via ASE; however, I found the speed is quite slow. I'm wondering if there is any way to accelerate the simulation. Also, can I integrate MatterSim with LAMMPS? Thanks so much.

Best wishes,
Lingjun

Motivation

  1. To accelerate MD simulations;
  2. To integrate with LAMMPS, which is quite fast.

Proposed Solution

No response

Contribution Interest

  • I'm interested in potentially implementing this feature
  • I can provide guidance, but cannot implement
  • I'm just suggesting the feature for the community

Code of Conduct

  • I agree to follow the project's Code of Conduct
@luzihen
Contributor

luzihen commented Dec 20, 2024

Lingjun,

What hardware are you running the model on? The speed should be comparable to most GNN-based MLFFs.

We do have a LAMMPS interface; we just have not had the bandwidth to clean up the code and push it to the repo. So, while we welcome contributions from the community, in this case I'd suggest waiting a bit so as not to duplicate the work.

PS: interfacing with LAMMPS likely won't accelerate the simulation, as model inference is the most time-consuming part.
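
For reference, a minimal sanity check that inference is actually running on a GPU via the ASE interface. This is only a sketch that assumes the MatterSimCalculator constructor shown in the repository README; treat the exact arguments as an assumption rather than a confirmed API.

    # Hedged sketch: verify that the MatterSim ASE calculator runs on the GPU.
    # The MatterSimCalculator(device=...) argument follows the repository README
    # and is an assumption here, not a guaranteed signature.
    import torch
    from ase.build import bulk
    from mattersim.forcefield import MatterSimCalculator

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print("running inference on:", device)

    atoms = bulk("Si", "diamond", a=5.43)            # small test structure
    atoms.calc = MatterSimCalculator(device=device)
    print("energy (eV):", atoms.get_potential_energy())

If this prints "cpu", the slowdown is most likely missing GPU placement rather than the ASE interface itself.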

@HNUSTLingjunWu
Author

HNUSTLingjunWu commented Dec 21, 2024 via email

@JonathanSchmidt1
Copy link

JonathanSchmidt1 commented Jan 6, 2025

Does the LAMMPS interface include the possibility of running larger systems on multiple GPUs?
Right now, even with an H100, the system size is quite limited.
Or do you have any other way, e.g. offloading from the GPU to the CPU, to increase the system size?

Edit: wrapping the energy calculation in the forward call with torch.autograd.graph.save_on_cpu(pin_memory=True) allows going to significantly larger systems while still being faster than running on the CPU.
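
For reference, a minimal sketch of that trick at the PyTorch level. The model and graph-construction names below are generic placeholders rather than MatterSim internals; only the torch.autograd.graph.save_on_cpu context manager is taken from the comment above.

    # Sketch of offloading saved activations to the CPU during the energy
    # forward pass. `model` and `build_graph` are hypothetical stand-ins for
    # whatever maps atomic positions to a potential energy.
    import torch

    def energy_and_forces(model, build_graph, positions):
        positions = positions.clone().requires_grad_(True)
        # Tensors saved for backward are packed to pinned host memory here,
        # freeing GPU memory during the forward pass...
        with torch.autograd.graph.save_on_cpu(pin_memory=True):
            energy = model(build_graph(positions))
        # ...and unpacked back to the GPU during the force (backward) pass.
        forces = -torch.autograd.grad(energy, positions)[0]
        return energy.detach(), forces

The context manager only needs to wrap the forward pass; the backward call can sit outside it, and pinned memory keeps the copies back to the device non-blocking.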

@HNUSTLingjunWu
Author

HNUSTLingjunWu commented Jan 31, 2025 via email

@JonathanSchmidt1

Dear Lingjun,

Thank you for your response.

I have two questions:

  1. Does your LAMMPS interface support running systems that do not fit into the VRAM of one GPU across multiple GPUs using MPI?
  2. Offloading activations from VRAM to RAM or using pipelining allows for training larger models. In the case of force fields, such techniques could also enable simulations of larger systems, as force calculations for the M3GNet model require backpropagation. If your code does not currently support multiple GPUs, have you explored any offloading strategies to scale to larger systems?

Specifically, this approach keeps calculations on the GPU while transferring intermediate activations to RAM or recomputing them during backpropagation for force calculations. After my initial inquiry, I experimented with a straightforward strategy in PyTorch—wrapping the energy computation during the forward pass with torch.autograd.graph.save_on_cpu(pin_memory=True). This indeed allowed me to run significantly larger systems at an acceptable speed, which was still faster than CPU-based computations.

Best wishes,
Jonathan
