Odd data in PDBBind2016 #19

Open
hnisonoff opened this issue Jul 17, 2022 · 14 comments

@hnisonoff

I wanted to flag some oddities with the PDBBind2016 dataset. I've tried to recompute the RMSDs and have noticed that a very large fraction do not match the values in the data. One particularly odd example I found was 5c28, where the docked ligand is a different molecule from the crystal ligand. Is there by any chance a cleaner version of the PDBBind docked dataset that could be used?

@francoep
Collaborator

The data we used was downloaded directly from the PDB at the time of the data creation. We only took the affinity numbers from the PDBbind data, matching PDB+ligname from PDBbind to what was identified via pocketome.

Could you give more information about this example (Pocket & specific files)?

@francoep
Collaborator

Or did I misunderstand, and you meant exactly the PDBbind2016 data?

@francoep
Collaborator

Additionally, looking at the 5c28 example you mentioned, the molecules are the same. Could you provide more examples of faulty data, and also describe how you are trying to re-compute the RMSDs?

@hnisonoff
Author

I am describing the 5c28 example from PDBbind2016.tar.gz. I just double-checked by redownloading everything.

Here is an image showing what happens when I load the docked molecule and the one labeled as ligand. As you can see they are different.

image

@hnisonoff
Author

Here is the directory. Files are 5c28_docked.sdf and 5c28_ligand.sdf
5c28.zip

@dkoes
Contributor

dkoes commented Jul 27, 2022

Doesn't look different to me. Looks like the same molecule rotated 180 degrees.

@hnisonoff
Author

Sorry, you are right. I apologize for the inconvenience. For RMSDs I was using RDKit's CalcRMS and spyrmsd. I'll go back and fill this out with more detail. Sorry again for the incorrect bug report.

@francoep
Collaborator

We calculated the RMSDs that we reported with obrms (it comes with an Open Babel install).

@dkoes
Contributor

dkoes commented Jul 27, 2022

CalcRMS does not do symmetry correction. spyrmsd is supposed to. I'd be interested in seeing examples where obrms and spyrmsd differ.

@hnisonoff
Author

I'll try to dig up examples:

For CalcRMS, the documentation says: "Note: This function will attempt to align all permutations of matching atom orders in both molecules"

Doesn't this imply it does symmetry correction?
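
The "permutations of matching atom orders" idea quoted above can be illustrated with a toy sketch. This is not how RDKit, spyrmsd, or obrms are implemented (real tools are understood to use graph matching rather than brute force); the function names, the carboxylate-like fragment, and its coordinates are all made up for illustration. It compares an RMSD computed in file atom order with the minimum RMSD over every relabeling that preserves elements and bonds, without moving either structure.

```python
import math
from itertools import permutations

def rmsd(a, b):
    """Plain RMSD between two equally long coordinate lists, no fitting."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def symmetric_rmsd(ref, probe, elements, bonds):
    """Minimum in-place RMSD over relabelings that keep elements and bonds.

    Brute force over all permutations: only usable for tiny toy inputs.
    """
    n = len(ref)
    bond_set = {frozenset(b) for b in bonds}
    best = float("inf")
    for perm in permutations(range(n)):
        # perm[i] is the probe atom matched to reference atom i
        if any(elements[i] != elements[perm[i]] for i in range(n)):
            continue
        if {frozenset((perm[i], perm[j])) for i, j in bonds} != bond_set:
            continue
        best = min(best, rmsd(ref, [probe[perm[i]] for i in range(n)]))
    return best

# Hypothetical fragment: C0-C1 with two equivalent oxygens on C1
# (bond orders are ignored in this toy graph).
elements = ["C", "C", "O", "O"]
bonds = [(0, 1), (1, 2), (1, 3)]
ref = [(-1.5, 0.0, 0.0), (0.0, 0.0, 0.0), (0.6, 1.1, 0.0), (0.6, -1.1, 0.0)]

# Same fragment flipped 180 degrees about the x axis: the oxygens trade places.
probe = [(x, -y, -z) for x, y, z in ref]

naive = rmsd(ref, probe)                                   # file atom order
corrected = symmetric_rmsd(ref, probe, elements, bonds)    # best relabeling
print(f"naive: {naive:.3f}  symmetry-corrected: {corrected:.3f}")
```

In this toy case the flipped fragment occupies exactly the same positions, so only the symmetry-corrected value reflects that the two poses coincide.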

@dkoes
Contributor

dkoes commented Jul 27, 2022

Huh, you're right - that's what the documentation says. Not sure why there is also GetBestRMS then. Maybe historically CalcRMS didn't do that?

@drewnutt
Contributor

GetBestRMS aligns the two molecules in space and returns the RMSD after that alignment.
CalcRMS calculates the RMSD without moving either molecule.
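
The difference between an in-place RMSD and an RMSD after alignment can be sketched with a small toy, done in 2D so that the optimal rotation has a closed form. This is only an illustration under those assumptions: it is not RDKit's or Open Babel's code, it ignores symmetry correction, and the function names and coordinates are hypothetical.

```python
import math

def rmsd_in_place(ref, probe):
    """RMSD with both point sets left exactly where they are."""
    sq = sum((px - rx) ** 2 + (py - ry) ** 2
             for (rx, ry), (px, py) in zip(ref, probe))
    return math.sqrt(sq / len(ref))

def centered(points):
    """Translate a point set so its centroid sits at the origin."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(x - cx, y - cy) for x, y in points]

def rmsd_after_fit(ref, probe):
    """RMSD after the best rigid superposition (translation + rotation) in 2D."""
    r, p = centered(ref), centered(probe)
    # Rotating the probe by theta maximizes its overlap with the reference.
    s = sum(px * ry - py * rx for (rx, ry), (px, py) in zip(r, p))
    c = sum(px * rx + py * ry for (rx, ry), (px, py) in zip(r, p))
    theta = math.atan2(s, c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    fitted = [(px * cos_t - py * sin_t, px * sin_t + py * cos_t) for px, py in p]
    return rmsd_in_place(r, fitted)

# Hypothetical pose: the reference rotated by 90 degrees and shifted by (3, 0).
ref = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
probe = [(-y + 3.0, x) for x, y in ref]

in_place = rmsd_in_place(ref, probe)
fitted = rmsd_after_fit(ref, probe)
print(f"in place: {in_place:.3f}  after fit: {fitted:.3f}")
```

When a docked pose is compared with a crystal pose in the same receptor frame, the in-place value is usually the one of interest, because fitting first would hide translation and rotation errors in the pose.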

@JonasLi-19

I suppose that 'obrms' also calculates the RMSD without moving either molecule?

@dkoes
Contributor

dkoes commented Jul 21, 2023

Yes, unless -m is passed.
