Extended SMILES saved from Ketcher might be invalid for RDKit #1865
Comments
Yes, aromatic bonds were not converted correctly for the double-bonded oxygen, and aromaticity is kept on the atoms, which should not be the case (it should be converted to ":" bonds). It is suggested to un-aromatize such structures.
@AlexanderSavelyev - I have a report from users that led me to this issue. Could you confirm that it is the same? Load
A colleague helped me with the reasoning in terms of chemistry :)
BTW, there is no need to use Extended SMILES; simple Daylight SMILES is enough (perhaps it is a good idea to correct the issue title). With the OP's input:
Need to switch to Indigo for SMILES generation.
The bug appears because the interchange KET format doesn't support an explicitly stated implicit hydrogen count, which bracketed SMILES atoms can specify as a virtual hydrogen counter. Typically this is not an issue, but there are special cases where the standard valence model fails to determine the number of suppressed hydrogens. For instance, in the example above the N atom is connected to an aromatic ring, so automatic hydrogen counting is not possible. To avoid the ambiguity, [nH] explicitly specifies that the nitrogen atom has one implicit hydrogen.
To fix the issue on Ketcher's side:
1) Add an implicitHCount field to the atom entity of the KET-format JSON schema:
2) As Ketcher has its own parser/generator for MOL V2000, the corresponding conversion of the virtual hydrogen counter implicitHCount to the "ChemAxon-style" data S-group should be implemented. That is, if a MOL V2000 file has a data group as below: M STY 1 1 DAT it should be converted to the atom's implicitHCount property, and when generating MOL V2000 the data S-groups should be added based on the implicitHCount value. Some info about the MRV_IMPLICIT_H data S-group:
3) In editing mode, when a heteroatom connects to an aromatic ring, it is necessary to add an implicitHCount property to this atom to specify the number of hydrogens on it.
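As a rough illustration of why the field matters, the sketch below models an atom that carries an optional hydrogen count and writes it as a SMILES atom token. It is not Ketcher or Indigo code: the `Atom` class, the `implicit_h_count` attribute (standing in for the proposed implicitHCount field), and `smiles_atom` are hypothetical names, and the sketch assumes atoms from the SMILES organic subset with no charge, isotope, or chirality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Atom:
    """Hypothetical atom record; not the real KET schema."""
    label: str                               # element symbol, e.g. "N"
    aromatic: bool = False
    implicit_h_count: Optional[int] = None   # None = leave it to the valence model


def smiles_atom(atom: Atom) -> str:
    """Write one atom as a SMILES token (organic subset, uncharged only)."""
    symbol = atom.label.lower() if atom.aromatic else atom.label
    if atom.implicit_h_count is None:
        # No stored count: emit the bare symbol and let the reader infer hydrogens.
        return symbol
    h = atom.implicit_h_count
    h_part = "" if h == 0 else "H" if h == 1 else f"H{h}"
    # A stored count forces a bracket atom so the hydrogens are stated explicitly.
    return f"[{symbol}{h_part}]"


print(smiles_atom(Atom("N", aromatic=True, implicit_h_count=1)))  # [nH]
print(smiles_atom(Atom("N", aromatic=True)))                      # n
```

When the count is dropped on the way through the interchange format, the writer can only emit the bare `n`, which is the information loss described in this comment.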
Functionality for supporting implicit hydrogens in the MOL V2000 format will be implemented separately as part of #2500.
Steps to Reproduce
Load c1cccc(-c2ccc(Nc3cccc4c(=O)[nH]ccc34)nc2)c1 and save it as Extended SMILES; the result is C1C=C(C2C=NC(Nc3c4c(c(ncc4)=O)ccc3)=CC=2)C=CC=1
Actual behavior
mol.is_valid() on the RDKit.js website returns false.
Expected behavior
SMILES generated by Ketcher should all be valid for RDKit.
Ketcher version
2.6.2
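A quick way to see what was lost in the round trip is to compare the bracket atoms in the two strings from the report. The sketch below uses only the Python standard library; `bracket_atoms` is a hypothetical helper, and the check only inspects text, so it does not replace a real validity test in RDKit.

```python
import re
from typing import List

# Matches one bracket atom such as "[nH]".
BRACKET_ATOM = re.compile(r"\[[^\]]+\]")


def bracket_atoms(smiles: str) -> List[str]:
    """Return every bracket atom token found in a SMILES string."""
    return BRACKET_ATOM.findall(smiles)


# Both strings are copied from the report above.
source = "c1cccc(-c2ccc(Nc3cccc4c(=O)[nH]ccc34)nc2)c1"
exported = "C1C=C(C2C=NC(Nc3c4c(c(ncc4)=O)ccc3)=CC=2)C=CC=1"

print(bracket_atoms(source))    # ['[nH]']
print(bracket_atoms(exported))  # []
```

The input carries one `[nH]` token while the exported string has none, so the hydrogen on the ring nitrogen is no longer stated anywhere in the output.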