You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Issue in V3000 implicit h toggle causing the the S group total count referenced in memory to be modified, causing a failure in parsing the S group when reading from RDKit as it thinks the file is corrupted.
Issue in V3000 export of M V30 BEGIN COLLECTION block occuring before M V30 BEGIN SGROUP block, causing RDKit failing to parse the section.
Steps to Reproduce
Import the RxN . Export and import to RDKit. See Failures. The test file had been provided under attachment section.
The section of the code above adds new s groups to mol.sgroups object while the execution is still in progress exporting the data, and more importantly, this was added AFTER the following line was already exported to the text file I/O:
output.writeStringCR("M V30 BEGIN CTAB");
output.printfCR("M V30 COUNTS %d %d %d 0 0", mol.vertexCount(), mol.edgeCount(), mol.countSGroups());
output.writeStringCR("M V30 BEGIN ATOM");
So, when implicit H is added, total S group counter written already in an earlier line had become incorrect.
Issue 2
Per this technical manual: https://www.daylight.com/meetings/mug05/Kappler/ctfile.pdf
In page 85 to page 90, the order of BEGIN SGROPU and BEGIN COLLECTION is defined. RDKit follows this strict ordering while Indigo writes it in reverse. It causes toolkit compatibility issue when read by another tool.
The following block of code:
MoleculeStereocenters& stereocenters = mol.stereocenters;
if (stereocenters.begin() != stereocenters.end() || mol.hasHighlighting())
{
output.writeStringCR("M V30 BEGIN COLLECTION");
QS_DEF(Array<int>, processed);
processed.clear_resize(mol.vertexEnd());
processed.zerofill();
for (i = mol.vertexBegin(); i != mol.vertexEnd(); i = mol.vertexNext(i))
{
if (processed[i])
continue;
ArrayOutput out(buf);
int j, type = stereocenters.getType(i);
if (type == MoleculeStereocenters::ATOM_ABS)
out.writeString("MDLV30/STEABS ATOMS=(");
else if (type == MoleculeStereocenters::ATOM_OR)
out.printf("MDLV30/STEREL%d ATOMS=(", stereocenters.getGroup(i));
else if (type == MoleculeStereocenters::ATOM_AND)
out.printf("MDLV30/STERAC%d ATOMS=(", stereocenters.getGroup(i));
else
continue;
QS_DEF(Array<int>, list);
list.clear();
list.push(i);
for (j = mol.vertexNext(i); j < mol.vertexEnd(); j = mol.vertexNext(j))
if (stereocenters.sameGroup(i, j))
{
list.push(j);
processed[j] = 1;
}
out.printf("%d", list.size());
for (j = 0; j < list.size(); j++)
out.printf(" %d", _atom_mapping[list[j]]);
out.writeChar(')');
_writeMultiString(output, buf.ptr(), buf.size());
}
if (mol.hasHighlighting())
{
if (mol.countHighlightedBonds() > 0)
{
ArrayOutput out(buf);
out.printf("MDLV30/HILITE BONDS=(%d", mol.countHighlightedBonds());
for (i = mol.edgeBegin(); i != mol.edgeEnd(); i = mol.edgeNext(i))
if (mol.isBondHighlighted(i))
out.printf(" %d", _bond_mapping[i]);
out.writeChar(')');
_writeMultiString(output, buf.ptr(), buf.size());
}
if (mol.countHighlightedAtoms() > 0)
{
ArrayOutput out(buf);
out.printf("MDLV30/HILITE ATOMS=(%d", mol.countHighlightedAtoms());
for (i = mol.vertexBegin(); i != mol.vertexEnd(); i = mol.vertexNext(i))
if (mol.isAtomHighlighted(i))
out.printf(" %d", _atom_mapping[i]);
out.writeChar(')');
_writeMultiString(output, buf.ptr(), buf.size());
}
}
if (mol.custom_collections.size() > 0)
{
for (i = mol.custom_collections.begin(); i != mol.custom_collections.end(); i = mol.custom_collections.next(i))
{
ArrayOutput out(buf);
out.printf("%s", mol.custom_collections.at(i));
_writeMultiString(output, buf.ptr(), buf.size());
}
}
output.writeStringCR("M V30 END COLLECTION");
}
P.S.: I have confirmed with unit testing code that I made changes in locally tests/tests/molecule.cpp, that the code will export correctly when implicit H is force-disabled and order of export is reversed.
#include<unistd.h>
#include<fstream>
#include"molecule/molfile_saver.h"
#include"reaction/reaction.h"
#include"reaction/reaction_auto_loader.h"
#include"reaction/reaction_automapper.h"TEST_F(IndigoCoreMoleculeTest, Reaction)
{
Reaction reaction;
{
char dir[250];
getcwd(dir, 250);
cout << "Current directory: " << dir << "\n";
ifstream reactionFile("../../data/reactions/other/stereo_reaction.rxn");
string content;
string line;
while (getline(reactionFile, line))
{
content += line;
content.push_back('\n');
}
reactionFile.close();
// cout << "The content is: \n" << content << "\n";
cout << "Loading reaction...";
loadReaction(content.c_str(), reaction);
reaction.aromatize(AromaticityOptions(AromaticityOptions::GENERIC));
ReactionAutomapper ram(reaction);
ram.automap(0);
for (int i = reaction.productBegin(); i < reaction.productEnd(); i = reaction.productNext(i))
{
cout << " *** Current Product Index is: " << i << " *** \n";
Molecule& mol = reaction.getMolecule(i);
string molOutStr;
StringOutput molOut(molOutStr);
MolfileSaver molSaver(molOut);
molSaver.saveMolecule(mol);
cout << molOutStr;
}
}
}
Actually Issue 1 seems to be fixed in 1.20 release. (We were on 1.12 branch, makes sense to upgrade now 👍 )
Issue #2 is still present but it was an easy fix on my side in C++ code.
Summary
Inside molfile_saver.cpp, function: void MolfileSaver::_writeCtab(Output& output, BaseMolecule& mol, bool query):
Issue in V3000 implicit h toggle causing the the S group total count referenced in memory to be modified, causing a failure in parsing the S group when reading from RDKit as it thinks the file is corrupted.
Issue in V3000 export of M V30 BEGIN COLLECTION block occuring before M V30 BEGIN SGROUP block, causing RDKit failing to parse the section.
Steps to Reproduce
Import the RxN . Export and import to RDKit. See Failures. The test file had been provided under attachment section.
Expected behavior/Actual Behavior
Issue 1
First on the issue of improper S group counting:
The section of the code above adds new s groups to mol.sgroups object while the execution is still in progress exporting the data, and more importantly, this was added AFTER the following line was already exported to the text file I/O:
So, when implicit H is added, total S group counter written already in an earlier line had become incorrect.
Issue 2
Per this technical manual:
https://www.daylight.com/meetings/mug05/Kappler/ctfile.pdf
In page 85 to page 90, the order of BEGIN SGROPU and BEGIN COLLECTION is defined. RDKit follows this strict ordering while Indigo writes it in reverse. It causes toolkit compatibility issue when read by another tool.
The following block of code:
I think, should be moved to right AFTER
Attachments
stereo_reaction.zip
If applicable, add attachment files to reproduce the issue.
Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: