
Setting velocities triggers an exception #61

Closed
raimis opened this issue Jan 25, 2022 · 1 comment · Fixed by openmm/openmm#3424
Labels: bug


raimis commented Jan 25, 2022

I tried to work on #59 and ran into an issue.

This is a simplified script:

import numpy as np
import torch as pt
from openmm import Context, Platform, System, VerletIntegrator
from openmmtorch import TorchForce

class NNP(pt.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, positions):
        return pt.sum(positions)

pt.jit.script(NNP()).save('model.pt')

num_atoms = 3
system = System()
for _ in range(num_atoms):
    system.addParticle(1.0)
system.addForce(TorchForce('model.pt'))

integrator = VerletIntegrator(1.0)

context = Context(system, integrator, Platform.getPlatformByName('CUDA'))
context.setPositions(np.random.rand(num_atoms, 3))
context.setVelocitiesToTemperature(300)

integrator.step(1000)

which triggers an exception:

Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    context.setVelocitiesToTemperature(300)
  File "/shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/openmm/openmm.py", line 16068, in setVelocitiesToTemperature
    return _openmm.Context_setVelocitiesToTemperature(self, *args)
openmm.OpenMMException: The autograd engine was called while holding the GIL. If you are using the C++ API, the autograd engine is an expensive operation that does not require the GIL to be held so you should release it with 'pybind11::gil_scoped_release no_gil;'. If you are not using the C++ API, please report a bug to the pytorch team.
Exception raised from execute at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1635005512693/work/torch/csrc/autograd/python_engine.cpp:123 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6a (0x7f1114a3198a in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xd4 (0x7f1114a2d494 in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&) + 0xb8 (0x7f11606ff718 in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x343921a (0x7f115c56f21a in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x6a (0x7f115c5704da in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x349cb77 (0x7f115c5d2b77 in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #6: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x4a (0x7f1159d156ea in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #7: TorchPlugin::CudaCalcTorchForceKernel::execute(OpenMM::ContextImpl&, bool, bool) + 0xe3f (0x7f10d0a8408f in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/plugins/libOpenMMTorchCUDA.so)
frame #8: OpenMM::ContextImpl::calcForcesAndEnergy(bool, bool, int) + 0xc9 (0x7f10d2d0f6e9 in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #9: OpenMM::Context::setVelocitiesToTemperature(double, int) + 0xd5 (0x7f10d2d0cdc5 in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #10: <unknown function> + 0x101324 (0x7f10d3089324 in /shared2/raimis/opt/miniconda/envs/ot_tut/lib/python3.7/site-packages/openmm/_openmm.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
frame #25: __libc_start_main + 0xf5 (0x7f119c3c6555 in /lib64/libc.so.6)

The issue happens when setting the velocities:

context.setVelocitiesToTemperature(300)

If I comment that line, the script works.
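Until this is fixed, one possible workaround is to skip `setVelocitiesToTemperature` entirely and sample Maxwell-Boltzmann velocities in NumPy, then pass them to `context.setVelocities`, which does not trigger a force evaluation. A minimal sketch (the helper function is my own, not part of OpenMM; it assumes OpenMM's MD unit system: masses in daltons, velocities in nm/ps):

```python
import numpy as np

# Boltzmann constant in OpenMM's MD unit system, kJ/(mol K); with masses in
# daltons this gives velocities in nm/ps.
KB = 0.00831446261815324

def sample_velocities(masses, temperature, rng=None):
    """Draw Maxwell-Boltzmann velocities for particles with the given masses."""
    rng = np.random.default_rng() if rng is None else rng
    masses = np.asarray(masses, dtype=float)
    # Each velocity component is Gaussian with standard deviation sqrt(kB*T/m).
    sigma = np.sqrt(KB * temperature / masses)
    return rng.normal(size=(len(masses), 3)) * sigma[:, None]

# velocities = sample_velocities([1.0] * num_atoms, 300)
# context.setVelocities(velocities)  # does not call calcForcesAndEnergy
```

Unlike `setVelocitiesToTemperature`, this does not remove velocity components along constrained distances, so it is only a drop-in replacement for unconstrained systems like the one in the script above.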


raimis commented Jan 25, 2022

So, OpenMM::Context::setVelocitiesToTemperature calls OpenMM::ContextImpl::calcForcesAndEnergy, which in turn executes the PyTorch model.

The GIL is released only for step, getState, and minimize (openmm/openmm#3061), but not for setVelocitiesToTemperature.
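For context, the force evaluation that ends up inside setVelocitiesToTemperature is a backward pass through the model: the forces are the negative gradient of the returned energy with respect to the positions. In plain PyTorch, that step looks roughly like this (a sketch of the mechanism, not the plugin's actual code):

```python
import torch as pt

class NNP(pt.nn.Module):
    def forward(self, positions):
        return pt.sum(positions)

model = NNP()
positions = pt.rand(3, 3, requires_grad=True)
energy = model(positions)
# This backward call is what reaches torch::autograd::backward in the
# traceback above; when invoked from C++ while the GIL is held, the
# autograd engine raises the error shown.
energy.backward()
forces = -positions.grad  # sum(positions) has gradient 1 everywhere
```

From Python this runs fine because the interpreter manages the GIL itself; the failure only appears when the backward pass is driven from C++ code that never released the GIL.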
