-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CHARMM: Handle patches #22
Comments
@jchodera, I'm running into a problem when writing out the patches in the format that @peastman had documented here This is what the patch looks like in CHARMM.
For the first two problems, we can probably devise a way to use the yaml file to find the template of the residues the patch will be patched to to figure out which atoms are changed rather than added and which bonds to remove, but that would mean using the yaml file in Parmed. Maybe @swails can offer some insight? |
@peastman : We hadn't realized that this was going to be so problematic when the The most straightforward solution here would be to modify the OpenMM
We discussed the alternatives---engineering a separate pipeline or trying to massively overhaul ParmEd---and think that it will set us back a few weeks if we need to go that route. We need to fix the |
Since we're only including a handful of patches (NTER, CTER, DISU, and is that it?), what about just writing them by hand? That should be trivial to do. If we want to support a large set of them in the future, we can worry about getting ParmEd to generate them at that point. |
@peastman: There's a few issues with manual generation of patches:
@ChayaSt and I feel the best thing to do is to make these changes to the |
It actually does. I may have forgotten to mention that in the documentation. Sorry if I did! If you specify
It already does that too.
I do see how this would simplify the process of automatically converting the CHARMM files, and let's discuss ways of handling it. But since we're only talking about three patches that can trivially be converted by hand, this isn't actually blocking anything, is it? |
It's actually more than 3 since I'm including the common termini (total of 8), specific termini for |
Ok, let's try to do something automated then. Could the YAML file list the atoms that are being changed rather than added? I'm just trying to figure out a way to avoid removing error checking from OpenMM. Adding a new atom and changing an existing atom are fairly different operations, and making the user specify in the force field XML which one they're doing could catch a lot of user errors. I want to keep that protection if possible. |
We don't actually have any way to know exactly which atoms and bonds should be deleted since the charmm patches do not provide this information, so there is no fundamentally superior way to check for errors in this case. What if the error checking was skipped only for patches that have a "behavior=charmm" attribute? |
That can be tough because the CHARMM template doesn't differentiate. All it gives you is a list of atoms from which some are changed and some are added. |
I know, you'd have to add that information yourself. For example, the NTER patch just lists six atoms: N, HT1, HT2, HT3, CA, and HA. It doesn't say which are added and which are changed. But you know that HT1, HT2, and HT3 are being added while N, CA, and HA are being changed. That's why I suggested putting it in the yaml file. To put it differently: CHARMM really ought to provide that information, and it's a flaw in their design that it doesn't! OpenMM's design is better. I'd rather keep it that way, even if it means having to fill in the missing information by hand at conversion time, rather than force OpenMM to adopt the same design flaws that CHARMM has. |
Except we don't know that for certain. We do not have that information available, and the only way to know what atoms are deleted is to discern this from a particular combination of patch with residue that should be allowed. We also believe there are cases where the atoms that are deleted will be different for different target residues. I understand you have issues with how CHARMM represents this information, but I also understand that you would like to make the CHARMM forcefield available in OpenMM. In the end, there is always a compromise that must be made, and we're suggesting a logical compromise that mimics the behavior of CHARMM is the right compromise here. |
Also, CHARMM clearly permits any of these combinations, meaning they are utterly valid in the CHARMM world. Who are we to say what is or isn't appropriate in the CHARMM world? |
We do know that. Remember that the XML file has to explicitly specify what patches can be applied to what templates. They don't get combined in arbitrary ways, only in the ways we specifically enumerate. A "patch" in OpenMM is something very different from a "patch" in CHARMM, even though we use the same name for both of them. In retrospect maybe we should have chosen a different name to avoid that confusion. A CHARMM patch is a set of rules for modifying residues. The user tells the program, "Apply this patch to this residue," and it makes the modifications. An OpenMM patch doesn't modify anything. The system already has whatever atoms and bonds it's going to have, and we don't change that. The patch is a rule for generating new templates for parameter assignment. The user doesn't tell it what patches to apply. Instead, it automatically figures out what patched template is needed to match a residue. So even though the OpenMM "patches" are (mostly) generated from the information contained in CHARMM "patches", they're not the same thing. Sometimes we'll need to supply extra information. Sometimes there may not be a 1:1 correspondence between CHARMM patches and OpenMM patches. That's not a problem. But if we try to think of them as actually the same thing, that will lead us into confusion. |
An example of this it PRO. The N there is type N rather than NH1. While it has its own NTER and ACP patch, it doesn't have its own NNEU where the N is changed to an NH2 from either an NH1 for the other residues and from N for PRO. |
That means we'll need to create one version of the patch for PRO, and a different one for all the other amino acids. Not a problem. The force field explicitly specifies what patches can be applied to what templates. |
No, we don't know that for CHARMM because the information is not provided. What you suggest amounts to preenumerating all possible combinations of patches and residues, which is what the Patch tag was meant to avoid. I don't think this is a solution that will gracefully scale to the whole charmm forcefield. Someone may be able to hack it together manually with a few days of effort (we won't be doing that here), but we won't easily be able to extend this later. The Patch scheme was implemented to make CHARMM work. There is an easy way to fix the implementation to make CHARMM work with minimal effort without compromising the OpenMM error checking for anything that isn't already permissible by CHARMM. I'm not sure why there is resistance to this. We will have to circle back and schedule a discussion by Skype for Monday or Tuesday of next week to sort out remaining issues. Today is unfortunately packed. |
No, that is exactly what the patch tag requires. Please look at the spec (which you agreed to when it was implemented!) to see how it works. You're making claims that just aren't correct. You either have to include an |
@peastman : I think there's still a bit of confusion over what we're proposing to change. We can still provide and require this---it's necessary to prevent combinatorial explosion for OpenMM---but we are suggesting that the determination of which atoms are changed vs which atoms are added should be made automatically at the time of Patch application. If we don't do this, we will need to pre-enumerate all possible combinations and split out patches into multiple patches automatically before creating the Patch records in a manner that I don't think will scale. You were worried that we are eliminating error checking by not explicitly specifying which atoms are added vs changed in the patch---information the CHARMM PRES doesn't provide---but @ChayaSt discovered CHARMM adds a different kind of sanity check: The net charge change. The
|
Yep, that's exactly what I'm proposing. There's missing information in the CHARMM file. We need to fill that in. Do we do that at force field conversion time, or at system construction time? In the first case, it happens in code that's run exactly once, when the force field is converted. We can check over the results carefully, validate them thoroughly, make sure everything is correct. In the second case, it's code that runs every single time a user calls It seems pretty obvious to me which one is the better approach.
You have to do that regardless. The XML file explicitly enumerates all allowed combinations. That's remains true either way. |
You're asking us to explicitly enumerate not only all allowed combinations, but also break these up into classes that require different sets of atom additions and atom alterations. Right now, this can only be done by manually inspecting each (PRES, RESI) pair for something like 25 patches. Let's rule out the manual approach. Even if it was something reasonable this time, it won't scale in the future. Let me look into what ParmEd might do to facilitate the automated enumeration of combinations and their segregation into sub-patch types depending on which sets of atoms are added vs altered. If you're concerned about safety checks, I presume we'll also implement the total charge change safety check into OpenMM as well? |
@ChayaSt : Does ParmEd handle the storage and writing of patches yet, or do we need to add that as well? I tried checking the list |
ParmEd does "handle" patches (see here), but currently it just keeps a list of atoms that need deleting in addition to the information stored in a residue template. That was the extent of how I was using patches at the time, so if it needs to be extended (as I suspect it does), then it can be. |
Thanks, @swails! Any idea where the We're talking with @peastman today to figure out exactly what needs to be done with patch writing for OpenMM, then will open a PR with the proposed changes! |
@swails: One more question: In CHARMM, patches ( |
@peastman : I think we can, within ParmEd, enumerate all "safe" combinations of patches using the total charge as a sanity check to exclude impermissible combinations. So listing residue compatibilities shouldn't be problematic. I think we have to decide whether to add to |
If only specifying one of them is the right thing to do, which one is it? |
I can't come up with compelling arguments for one way vs the other, which is why I'm thinking that specifying only one of them isn't the right thing to do. |
It doesn't matter. I just included both options for flexibility when combining multiple files. If you create a file that adds parameters for a new amino acid, you can use I'd say to go with |
OK, but can you add a warning to the documentation that |
Even better, I'll make it filter the duplicates. But you should also remove the duplicates from the file. That will reduce the size by many KB! |
Weren't you just commenting on how the 55 TB of data taken up by uncompressed I'm working on generating the new split files now. |
No, it was the 4 MB force field XML file I said wasn't significant. You should totally compress all your data! For reading them, just replace |
We're still struggling to figure out how to do that with the FAH cores. |
OK, I've now checked that
Here are the newly converted files: I'm still working to add more tests to both ParmEd and the conversion scripts, but I thought I'd share the files in their current state in case they turn out to be useful (or you catch bugs that I haven't seen yet). |
Yeah, that's another problem. Though isn't that really an issue on the server end, rather than the core? |
Ideally not. If the server has to do the compressing/decompressing, that's enormously more expensive than having the core uncompress when reading due to WU throughput. Having transparent reading/writing of compressed XML files at the OpenMM level would allow the cost of this to be amortized over the many available donor machines. |
It's possible that |
@peastman : I generated a membrane protein system by following the OpenMM 7 B2AR tutorial, and discovered another issue:
Minimal test attached: In this case, it looks like Tagging @ChayaSt |
I excluded that one stream file with D-amino acids and ended up with a different error:
Is it chaining multiple patches together? Test case attached: |
I believe that's the case. I remember Joseph talking about how compression and/or uncompression was taking a lot of time on the server, and he was looking into switching to a less expensive compression method. So it seems that's time that's already being spent.
If a residue doesn't match any template, it considers all possible combinations of single-residue patches it could apply. In this case, it's finding multiple combinations that could match, and it doesn't know which to pick. FHEM is the one I pointed out to you before that does nothing and can be applied to any residue. DELB is another one like it. They're confusing it, because it has no way to decide whether to apply those patches or not. |
I can certainly have the |
The right solution is not to include those patches in the XML file in the first place. A patch that does nothing makes no sense. |
That's almost, but not quite, the right solution. CHARMM permits patches that change the charges, atom types, or explicit impropers, but OpenMM cannot automatically perceive these for us. I think we want users to be able to select those patches manually if needed, but the automatic patch perception framework can't deal with them. In light of these, it would seem we want to include all patches, but only specify patches as compatible with a residue (for purposes of automated residue x patch expansion) if they modify the topology of the residue template. Does that sound right to you, @peastman? |
Those two patches literally do nothing. Here's the definition of one of them: <Patch name="FHEM">
</Patch> There's no benefit to including that in any form. We should remove them. A patch that changes atom types but does nothing else is an interesting case. Assuming the two atom types both correspond to the same element, there's no way it could automatically decide whether to use the patch or not. So it could only be useful if we had some mechanism for explicitly telling it to use a patch. Currently, we don't have any such mechanism. We could consider adding one in the future, though we'd need to consider carefully what the use cases are for it. |
I think I mentioned before that they do something important in the CHARMM world:
I'm surprised that
We definitely need a mechanism for telling One such example, I believe, is patches that change only the charges of titratable residues for constant-pH dynamics. |
It'll detect them automatically. |
It looks like I have to exclude all D-amino acids from the conversion because they match all the L-amino acids:
|
I've made the changes I think are necessary, but got bogged down converting the |
I'm still having trouble with the CHARMM conversion: lski2414:charmm choderaj$ python
Python 3.6.1 |Continuum Analytics, Inc.| (default, May 11 2017, 13:04:09)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from simtk.openmm import app
>>> pdbfile = app.PDBFile('tests/POPC.pdb')
>>> ff = app.ForceField('ffxml/charmm36.xml')
>>> system = ff.createSystem(pdbfile.topology)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/choderaj/miniconda/lib/python3.6/site-packages/simtk/openmm/app/forcefield.py", line 1270, in createSystem
force.createForce(sys, data, nonbondedMethod, nonbondedCutoff, args)
File "/Users/choderaj/miniconda/lib/python3.6/site-packages/simtk/openmm/app/forcefield.py", line 2388, in createForce
reverseMap[typeMap[typeValue]] = typeValue
IndexError: list assignment index out of range I'm not quite sure what's causing this failure, but it also took several minutes for the |
That last error was fixed in openmm/openmm#1915 I think this is finished now! Will reopen if we run into more trouble. |
Since the CHARMM forcefield only specified patch/residue compatibility with (barely) human-readable comments (see #1 (comment)), we plan to create a YAML file to direct which ffxml compatibility tags are to be inserted.
The initial version will cover
NTER
,CTER
, ACE/NME)The syntax will look something like
In future iterations, we will expand this compatibility list.
The text was updated successfully, but these errors were encountered: