Formal charges #831

Merged (16 commits) on Mar 12, 2022

@aerkiaga aerkiaga commented Mar 9, 2022

This pull request improves Avogadro's support for formal charges. First, it implements reading them from CML, which also covers other formats that are converted to CML internally (e.g. SMILES). Those charges are then passed down the class hierarchy. Second, PDB files can contain quaternary ammonium (e.g. acetylcholine, various drug ligands), tertiary sulfonium (S-adenosylmethionine), or even quaternary arsonium (https://www.rcsb.org/structure/5NXY) groups. Because those charges cannot be eliminated by hydrogen adjustment, specific code was added to detect them.

Finally, the code was brought closer in line with the other paths, so that the various optimization PRs can act on it as well.
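
As a rough illustration of the detection described above, the sketch below assigns a +1 formal charge to nitrogen or arsenic with four single bonds to heavy atoms, and to sulfur with three. The function name `oniumCharge` and its inputs are hypothetical and do not reflect Avogadro's actual implementation; it also assumes the bond orders are already known.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not Avogadro's API): guess the formal charge of an
// "onium" center from its element and the orders of its bonds to heavy atoms.
int oniumCharge(unsigned char atomicNumber, const std::vector<int>& bondOrders)
{
  // Only consider centers that make single bonds exclusively; this excludes
  // e.g. sulfoxides, where sulfur also has three neighbors.
  for (int order : bondOrders) {
    if (order != 1)
      return 0;
  }

  const std::size_t neighbors = bondOrders.size();
  switch (atomicNumber) {
    case 7:  // nitrogen: quaternary ammonium
    case 33: // arsenic: quaternary arsonium
      return neighbors == 4 ? 1 : 0;
    case 16: // sulfur: tertiary sulfonium
      return neighbors == 3 ? 1 : 0;
    default:
      return 0;
  }
}
```

Since such a center has no hydrogens to remove, the charge has to be assigned explicitly instead of being absorbed by hydrogen adjustment.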

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or

(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or

(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.

(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.

Signed-off-by: Aritz Erkiaga <aerkiaga3@gmail.com>
@github-actions
Contributor

github-actions bot commented Mar 9, 2022

Here are the build results
Avogadro2.AppImage
macOS.dmg
Ubuntu-2004.tar.gz
Win64.exe
Artifacts will only be retained for 90 days.

@ghutchis
Member

ghutchis commented Mar 9, 2022

The other part of this would be writing to CJSON since we're now using that preferentially over CML.

See io/cjsonformat.cpp

@ghutchis ghutchis left a comment

I think the question would be more on the generality - whether it's worth expanding beyond the simple cations to a slightly larger formal charge perception method.

I also don't remember if I added formal charges to the native mdlformat code -- it looks like not.

avogadro/core/molecule.cpp (4 review comments, outdated and resolved)
@ghutchis
Member

ghutchis commented Mar 9, 2022

This fixes your problems for sure, but it would be great to have a slightly broader patch to address CJSON and SDF support too and ideally also handle negative formal charges.

@aerkiaga
Collaborator Author

Regarding CJSON and SDF... How are new features introduced in CJSON; can I just send a Pull Request to them? Also, I'm not familiar with SDF at all; never used it. Where could I find the specification?

@ghutchis
Member

For CJSON, you're just adding an extension for atom formal charges, so it's backwards-compatible still.

In CjsonFormat::read, I'd do something like this for formal charges, following the existing pattern for labels:

  json labels = atoms["labels"];
  if (labels.is_array() && labels.size() == atomCount) {
    for (size_t i = 0; i < atomCount; ++i) {
      molecule.atom(i).setLabel(labels[i]);
    }
  }

In write I'd look at something like:

  // labels
  json labels;
  for (size_t i = 0; i < molecule.atomCount(); ++i) {
    labels.push_back(molecule.label(i));
  }
  root["atoms"]["labels"] = labels;
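
Applied to formal charges, the same pattern might look roughly like the sketch below. It is standalone C++ that builds the JSON array by hand rather than using the JSON library from the snippets above, and the function names are hypothetical. The idea of storing the values in an `atoms.formalCharges` array is an assumption made here for illustration, not something the discussion confirms.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical writer: serialize one formal charge per atom as a JSON array,
// e.g. "[0,1,-1]". The caller would store it under an assumed key such as
// atoms.formalCharges.
std::string formalChargesToJsonArray(const std::vector<signed char>& charges)
{
  std::string out = "[";
  for (std::size_t i = 0; i < charges.size(); ++i) {
    if (i != 0)
      out += ",";
    out += std::to_string(static_cast<int>(charges[i]));
  }
  out += "]";
  return out;
}

// Hypothetical reader: mirror the size check from the labels snippet and
// accept the parsed values only if there is exactly one entry per atom.
bool applyFormalCharges(const std::vector<int>& parsed,
                        std::vector<signed char>& target)
{
  if (parsed.size() != target.size())
    return false;
  for (std::size_t i = 0; i < parsed.size(); ++i)
    target[i] = static_cast<signed char>(parsed[i]);
  return true;
}
```

Because readers that do not know the extra array can simply ignore it, an extension of this kind stays backwards-compatible.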

@ghutchis
Member

For SDF, there are two parts - in the atom block itself and the M CHG parts:

atom block:

xxxxx.xxxxyyyyy.yyyyzzzzz.zzzz aaaddcccssshhhbbbvvvHHHrrriiimmmnnneee

The "ccc" part is the formal charge code (zero-based character offsets 36-38, i.e. columns 37-39 of the line): 0 = uncharged or other, 1 = +3, 2 = +2, 3 = +1, 5 = -1, 6 = -2, 7 = -3
(this is for backwards compatibility - M CHG, if present, takes priority)
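
The code table above is regular enough to write as `4 - code` for the charged entries. A minimal sketch, assuming the fixed-width layout shown above (which puts the field at zero-based offsets 36-38) and using hypothetical helper names rather than anything from mdlformat.cpp:

```cpp
#include <exception>
#include <string>

// Map an atom-block charge code to a formal charge:
// 1 -> +3, 2 -> +2, 3 -> +1, 5 -> -1, 6 -> -2, 7 -> -3, anything else -> 0.
// Code 4 (doublet radical in the CTfile format) carries no charge.
int chargeFromCode(int code)
{
  if ((code >= 1 && code <= 3) || (code >= 5 && code <= 7))
    return 4 - code;
  return 0;
}

// Extract the "ccc" field from one atom-block line and convert it.
// Short or malformed lines are treated as uncharged.
int chargeFromAtomLine(const std::string& line)
{
  if (line.size() < 39)
    return 0;
  try {
    return chargeFromCode(std::stoi(line.substr(36, 3)));
  } catch (const std::exception&) {
    return 0; // blank or non-numeric field
  }
}
```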

M CHG fields indicate the number of formal charges on the line (up to 8), then the atom number and the charge, e.g.

M  CHG  2   8  -1  10   1

So 2 atoms with charges, atom 8 has a -1 charge, atom 10 has a +1 charge.
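
A small sketch of parsing such a line follows. It splits on whitespace instead of reading the fixed-width columns, which is enough for lines like the example above but is a simplification; `parseChargeLine` is a hypothetical name, and the returned atom numbers are 1-based as in the file.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parse an "M  CHG" property line into (atom number, charge) pairs.
// Returns an empty vector if the line is not a valid M CHG line.
std::vector<std::pair<int, int>> parseChargeLine(const std::string& line)
{
  std::vector<std::pair<int, int>> result;
  std::istringstream in(line);
  std::string m, tag;
  int count = 0;

  if (!(in >> m >> tag >> count) || m != "M" || tag != "CHG")
    return result;
  if (count < 0 || count > 8) // at most 8 entries per line
    return result;

  for (int i = 0; i < count; ++i) {
    int atom = 0;
    int charge = 0;
    if (!(in >> atom >> charge))
      break; // truncated line: keep what was read so far
    result.emplace_back(atom, charge);
  }
  return result;
}
```

Charges read this way would then override any values taken from the atom block, since M CHG takes priority when present.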

test.txt

avogadro/io/cmlformat.cpp (1 review comment, outdated and resolved)

avogadro/io/mdlformat.cpp (3 review comments, resolved; 2 outdated)
@ghutchis
Member

Otherwise, this looks good. I was able to read in the test file, and adding hydrogens generated R-NH3+ and R-CO2- properly, which is great.

@ghutchis
Member

Thanks, the SDF fix looks good. Anything else or is this good to merge?
(screenshot attached)

@ghutchis ghutchis added the enhancement feature changes / API changes label Mar 11, 2022
@aerkiaga
Collaborator Author

Well, there's that thing with halogen-substituted anions, but at this point it would only be added to PDB, and I'm not aware of any drugs or research ligands with such structures...

So... yes, good to merge!

@aerkiaga
Collaborator Author

aerkiaga commented Mar 12, 2022

Well, maybe I should add support in CJSON? Looks small enough to me that you could merge it with that support tomorrow...

@ghutchis ghutchis merged commit ab9fd54 into OpenChemistry:master Mar 12, 2022
@aerkiaga aerkiaga deleted the formal-charges branch March 12, 2022 21:42
@aerkiaga aerkiaga restored the formal-charges branch March 12, 2022 21:46