More molecule fixes and helper reactions for checking molecules / reactions #75

Merged
merged 33 commits into main from fix_mol
Nov 6, 2023

Conversation

xiaoruiDong
Owner

This PR is again inspired by data-cleaning work on reaction datasets. It covers:

  1. fixing more nitrogen-containing (or otherwise more complicated) molecules by providing a feasible recipe and adding more unit tests
  2. adding a way to "saturate radicals": RDKitMol.GetClosedShellMol
  3. adding HasSameConnectivity to check the connectivity of molecules that share the same atom mapping
  4. allowing RenumberAtoms to accept a dict as a "shorter" input
  5. adding a way to check whether two reactions are equivalent regardless of the atom mapping
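As an illustration of item 5, here is a minimal sketch of a mapping-independent reaction comparison. It assumes each species has already been reduced to a canonical identifier without atom-map numbers (for example, a canonical SMILES string). The function name `is_equivalent_reaction` and the `both_directions` flag are hypothetical and are not taken from rdmc.

```python
from collections import Counter
from typing import Sequence


def is_equivalent_reaction(
    reactants_a: Sequence[str],
    products_a: Sequence[str],
    reactants_b: Sequence[str],
    products_b: Sequence[str],
    both_directions: bool = False,
) -> bool:
    """Compare two reactions by the multisets of their species identifiers.

    Assumption: every identifier is canonical and free of atom-map numbers,
    so equal strings mean equal species.
    """
    # Counter keeps stoichiometry: "A + A" differs from a single "A".
    forward = (
        Counter(reactants_a) == Counter(reactants_b)
        and Counter(products_a) == Counter(products_b)
    )
    if forward or not both_directions:
        return forward
    # Optionally treat the reverse reaction as equivalent as well.
    return (
        Counter(reactants_a) == Counter(products_b)
        and Counter(products_a) == Counter(reactants_b)
    )
```

Comparing multisets rather than sorted lists or sets keeps the check independent of species order while still respecting stoichiometric counts.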

Added a few more examples from analyzing the QuantumPioneer data
Updating the property cache is necessary if sanitize=False
1. Also silence the warning message emitted when saturating a mol
HasSameConnectivity is a helper function that compares the adjacency matrices of two molecules. The function that compares conformer connectivity is renamed to HasSameConnectivityConformer. GetClosedShellMol is a helper function that returns the closed-shell form of a radical.
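The adjacency-matrix comparison can be sketched in plain Python as follows. This is only an illustration, assuming both molecules use the same atom indexing and are given as bond lists; `adjacency_matrix` and `has_same_connectivity` are hypothetical names, not the rdmc implementation.

```python
from typing import Iterable, List, Tuple

Bond = Tuple[int, int]


def adjacency_matrix(num_atoms: int, bonds: Iterable[Bond]) -> List[List[int]]:
    """Build a symmetric 0/1 adjacency matrix from a list of bonded atom pairs."""
    matrix = [[0] * num_atoms for _ in range(num_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def has_same_connectivity(
    num_atoms_a: int,
    bonds_a: Iterable[Bond],
    num_atoms_b: int,
    bonds_b: Iterable[Bond],
) -> bool:
    """Return True if both molecules connect the same atom indices.

    Assumption: atom i in molecule A corresponds to atom i in molecule B.
    Bond orders are ignored; only connectivity is compared.
    """
    if num_atoms_a != num_atoms_b:
        return False
    return adjacency_matrix(num_atoms_a, bonds_a) == adjacency_matrix(num_atoms_b, bonds_b)
```

For example, with heavy atoms indexed C0, C1, O2, the bond list `[(0, 1), (1, 2)]` (ethanol skeleton) and `[(0, 2), (2, 1)]` (dimethyl ether skeleton) compare as different, while listing the same bonds in a different order compares as equal.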
This also simplifies usage. E.g., to swap the indices of only two atoms, one can now call mol.RenumberAtoms({0:2, 2:0}) instead of providing the full list.
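One way such a dict input could be expanded into a full ordering is sketched below. The helper name `expand_partial_order` is hypothetical, and the convention that unlisted atoms keep their index is an assumption made for illustration, not a statement about how rdmc implements it.

```python
from typing import Dict, List


def expand_partial_order(mapping: Dict[int, int], num_atoms: int) -> List[int]:
    """Expand a sparse index mapping into a full atom ordering.

    Assumption: atoms that are not mentioned in `mapping` keep their index.
    """
    order = list(range(num_atoms))  # start from the identity ordering
    for position, atom_index in mapping.items():
        if not (0 <= position < num_atoms and 0 <= atom_index < num_atoms):
            raise ValueError(f"index out of range: {position} -> {atom_index}")
        order[position] = atom_index
    # A valid ordering must use every atom index exactly once.
    if sorted(order) != list(range(num_atoms)):
        raise ValueError(f"mapping {mapping} does not define a permutation")
    return order


print(expand_partial_order({0: 2, 2: 0}, 4))  # [2, 1, 0, 3]
```

The permutation check rejects incomplete inputs such as `{0: 2}`, which would otherwise list atom 2 twice.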
Currently, Chem.MolFromSmiles silently returns None if the SMILES is not valid (e.g., handwritten), and the None object then raises an AttributeError in later steps. It is more informative to raise a ValueError that indicates the SMILES is not valid.
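The pattern can be sketched as a thin wrapper around any parser that returns None on failure. To keep the example self-contained, the parser is passed in as an argument; `parse_smiles_or_raise` is a hypothetical name, and in practice the `parser` argument would be something like RDKit's `Chem.MolFromSmiles`.

```python
from typing import Callable, Optional, TypeVar

M = TypeVar("M")


def parse_smiles_or_raise(smiles: str, parser: Callable[[str], Optional[M]]) -> M:
    """Parse `smiles` with `parser` and fail loudly on invalid input.

    `parser` is any callable that returns None when parsing fails.
    """
    mol = parser(smiles)
    if mol is None:
        # Fail at the point of parsing instead of with a later AttributeError.
        raise ValueError(f"Invalid SMILES: {smiles!r}")
    return mol
```

Raising at parse time puts the offending string in the error message, which is easier to trace than an AttributeError on None several calls later.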
This method uses the default substructure match to create a recipe. The recipe is used to transform the provided mol into the current mol.
The previous name, "GetSubStruct...", differed from the convention of the existing methods. Change it to "GetSubstruct...".

codecov bot commented Nov 1, 2023

Codecov Report

Attention: 95 lines in your changes are missing coverage. Please review.

Comparison is base (cc64388) 36.15% compared to head (aebc2ae) 38.24%.
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   36.15%   38.24%   +2.09%     
==========================================
  Files          34       34              
  Lines        3684     3812     +128     
  Branches      942      975      +33     
==========================================
+ Hits         1332     1458     +126     
+ Misses       2278     2275       -3     
- Partials       74       79       +5     
Files Coverage Δ
rdmc/external/logparser/base.py 22.75% <0.00%> (ø)
rdmc/resonance/resonance.py 82.53% <33.33%> (-2.47%) ⬇️
rdmc/mol_compare.py 47.50% <66.66%> (+6.82%) ⬆️
rdmc/reaction.py 66.81% <52.94%> (+8.40%) ⬆️
rdmc/fix.py 82.97% <77.50%> (+2.27%) ⬆️
rdmc/utils.py 49.51% <74.39%> (+5.54%) ⬆️
rdmc/mol.py 66.33% <72.67%> (+0.92%) ⬆️


In some cases ResonanceMolSupplier returns None, which causes errors downstream. This commit makes sure that, if nothing meaningful is created, at least the previously created mol is returned.
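The fallback behavior described here amounts to filtering out None results and returning the original molecule when nothing is left. A minimal sketch, with the hypothetical helper name `non_null_or_fallback` and generic objects standing in for molecules:

```python
from typing import Iterable, List, Optional, TypeVar

M = TypeVar("M")


def non_null_or_fallback(candidates: Iterable[Optional[M]], fallback: M) -> List[M]:
    """Drop None entries; if nothing remains, return the fallback alone."""
    kept = [candidate for candidate in candidates if candidate is not None]
    return kept if kept else [fallback]
```

With this shape, downstream code always receives a non-empty list of valid objects and does not need its own None checks.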
The Openbabel and Jensen XYZ perception algorithms do not perceive oxonium oxygen correctly. This commit tries to fix that by adding the missing bonds and correcting the charge.
1. Add an optional argument to control the threshold for missing-bond perception
2. Add an optional argument to skip sanitization for debugging and testing
3. Add a unit test for fix_oxonium_bonds
1. Add the `bothway` argument to avoid mistakenly changing the geometry sequence.
2. Allow passing additional arguments to the viewer
For unknown reasons, it may unexpectedly yield None objects. This commit adds a filter for its output.
@xiaoruiDong xiaoruiDong merged commit 1c42c87 into main Nov 6, 2023
28 checks passed
@xiaoruiDong xiaoruiDong deleted the fix_mol branch March 21, 2024 02:12