Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtual sites missing from gradient calculations #486

Open
jthorton opened this issue Dec 15, 2022 · 8 comments
Open

Virtual sites missing from gradient calculations #486

jthorton opened this issue Dec 15, 2022 · 8 comments
Labels
needs-info Needs more information from user(s)

Comments

@jthorton
Copy link
Contributor

I think that virtual sites are currently missing from property gradient calculations but should be included if present. Currently, evaluator tries to make a system with a reduced number of force field parameters when calculating the gradients of properties here and tries to make sure that any interdependencies between parameters are accounted for but virtual sites seem to be missed. I think they should be included and it would be great if not just that handler is included but also any subclasses of it are as well, this is related to the custom vsite handler we currently use in QUBEKit and this fix in the toolkit might be relavent here.

@mattwthompson
Copy link
Member

Could you write up a minimal reproduction of this? My guess is that Interchange is failing to create some of the virtual sites, but it's also possible that something is misfiring with the gradient keys.

@jthorton
Copy link
Contributor Author

Just bumped into this again when trying to fit to mixture properties including water. Here is a small example

"Make a sub system with evaluator for a mixture with a water 4 point vsite model"

from openff.evaluator.utils.openmm import system_subset
from openff.toolkit import Molecule, Topology, ForceField
from openff.evaluator.forcefield import ParameterGradientKey
import openmm

# make a topology for a mixture of ethane and water
top = Topology.from_molecules([Molecule.from_smiles("O")])

ff = ForceField("tip4p_fb-1.0.0.offxml")

p_key = ParameterGradientKey(tag="VirtualSites", smirks="[#1:2]-[#8X2H2+0:1]-[#1:3]", attribute="distance")

system, p_valu = system_subset(parameter_key=p_key, force_field=ff, topology=top, scale_amount=0.2)

with open("system.xml", "w") as output:
    output.write(openmm.XmlSerializer.serialize(system))

File "/Users/joshua/Documents/Projects/qube/evaluator/HCNO-v1/evaluator_vsite_testing/create_subsystem.py", line 15, in
system, p_valu = system_subset(parameter_key=p_key, force_field=ff, topology=top, scale_amount=0.2)
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/evaluator/utils/openmm.py", line 250, in system_subset
system = force_field_subset.create_openmm_system(topology)
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/utilities/utilities.py", line 80, in wrapper
return function(*args, **kwargs)
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/toolkit/typing/engines/smirnoff/forcefield.py", line 1164, in create_openmm_system
return self.create_interchange(
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/utilities/utilities.py", line 80, in wrapper
return function(*args, **kwargs)
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/toolkit/typing/engines/smirnoff/forcefield.py", line 1223, in create_interchange
return Interchange.from_smirnoff(
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/interchange/components/interchange.py", line 270, in from_smirnoff
return _create_interchange(
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/interchange/smirnoff/_create.py", line 117, in _create_interchange
_electrostatics(
File "/Users/joshua/mambaforge/envs/eval_new/lib/python3.9/site-packages/openff/interchange/smirnoff/_create.py", line 253, in _electrostatics
raise MissingParameterHandlerError(
openff.interchange.exceptions.MissingParameterHandlerError: Force field contains parameter handler(s) that may assign/modify partial charges, but no ElectrostaticsHandler was found.

so I think that all electrostatic handlers and subclasses of them should be included when we try to fit any electrostatic parameter.

@mattwthompson
Copy link
Member

mattwthompson commented Jul 31, 2023

If I understand your suggestion correctly, is the idea is add this logic to system_subset? Or should Interchange's behavior around this case be changed?

@jthorton
Copy link
Contributor Author

jthorton commented Aug 1, 2023

I think the logic should be in system_subset but will defer to you here.

I think the main difficulty is finding all the related parameters to the target. For example, fitting the v-site properties in mixtures will require the AM1BCC charge handler if that was used to parameterise the other molecules and when plugin handlers are involved it becomes more complicated again, how would we ensure we have included all present plugins which are related?

@mattwthompson
Copy link
Member

I think I misunderstood the problem originally; the error bubbles up from Interchange's rule about needing an ElectrostaticsHandler when anything might get a charge, so either Evaluator would need to always provide that (and probably other handlers that assign methods) or Interchange's rule about this behavior would need to change.

Does the gradient-fitting routine work when a "subset" still contains so many handlers? Here's where my lack of actual knowledge of the code and science limits me.

@jthorton
Copy link
Contributor Author

so either Evaluator would need to always provide that (and probably other handlers that assign methods) or Interchange's rule about this behavior would need to change.

Evaluator would need to provide all present handlers which control the same interaction as the one we want the gradient of including plugins. I can see two possible ways we could do this:

    1. only use the system subset when there are no plugins present. We can hard code the relationships between the parameters so include all used electrostatic handlers when the gradient is requested for one of them and include every handler when a plugin is present in the system.
    1. Have an attribute on each handler to indicate which interaction the handler controls so all handlers which control the same interaction type as the gradient parameter can be included.

Does the gradient-fitting routine work when a "subset" still contains so many handlers?

I think so if we go with case two even though we have to compute all the electrostatics we still get to skip all bond angle and torsion terms which should help.

@mattwthompson
Copy link
Member

I think there's a number of things that we could do here, and I'm not sure which to pick. Here's what I'm able to come up with now:

  1. Get system slicing working at all when virtual sites are present
  2. Get system slicing working at all when plugins potentially containing handler dependencies are present
    1. and 2.
    1. and/or 2. but in a performant manner

These are ordered in increasing utility and, I figure, also in increasing difficulty.

I've started #534 which at least gets your snippet above running without error. I don't grasp if that's enough to get gradient-based fitting actually working, but it's a start.

@mattwthompson
Copy link
Member

@jthorton have you been able to use this (it should be in the most recent release)? Any feedback on performance or breaking cases?

@mattwthompson mattwthompson added the needs-info Needs more information from user(s) label Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-info Needs more information from user(s)
Projects
None yet
Development

No branches or pull requests

2 participants