OpenFF Lipid Optimization Benchmark Supplement v1.0 #399

Status: Open, 5 commits to merge into base branch master

ntBre
Collaborator

@ntBre ntBre commented Oct 30, 2024

This is the benchmarking counterpart to #394, constructed from full molecules in the LIPID MAPS database. I think this is generally ready to go (apart from updating the main README), but I'm leaving it as a draft because @j-wags and I discussed partitioning datasets by molecule size to make it easier to request compute resources of a suitable size. We weren't sure exactly where to draw the cutoff, but this set ranges from 6 to 106 atoms, so I thought it was worth discussing. (Edit: the splitting question is resolved by the new tagging mechanism in #412, so this should be good to go.)
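
For illustration only, the size-based partitioning discussed above could look roughly like the sketch below. The function name `partition_by_size`, the record names, and the cutoff of 50 atoms are all hypothetical; the thread does not settle on a cutoff, and this is not the tagging mechanism from #412.

```python
def partition_by_size(atom_counts, cutoff):
    """Split record names into 'small' and 'large' groups by atom count.

    atom_counts maps a record name to its number of atoms.
    cutoff is the largest atom count still counted as 'small' (hypothetical).
    """
    groups = {"small": [], "large": []}
    for name, n_atoms in atom_counts.items():
        key = "small" if n_atoms <= cutoff else "large"
        groups[key].append(name)
    return groups


# Made-up records spanning the 6 to 106 atom range mentioned in the comment.
example_counts = {"mol-a": 6, "mol-b": 48, "mol-c": 106}
example_groups = partition_by_size(example_counts, cutoff=50)
```

With a split like this, each group could be submitted against workers sized for it, which is the motivation given in the comment.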

New Submission Checklist

  • Created a new folder in the submissions directory containing the dataset
  • Added README.md describing the dataset (see here for examples)
  • All files used to produce the dataset are included with a description
  • Dataset follows the QCSubmit schema defined for Datasets, OptimizationDatasets and TorsionDriveDatasets
  • Dataset filename matches pattern dataset*.json; may feature a compression extension, such as .bz2
  • A PDF depicting the molecules is attached. For torsiondrives, the PDF should highlight the central bond; qcsubmit can do this automatically.
  • QCSubmit validation passed
  • Made a new dataset entry in the mapping table in repository README.md
  • Ready to submit!
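
The filename rule in the checklist above (pattern `dataset*.json`, optionally with a compression extension) can be sketched as a small check. This is a minimal sketch, not the repository's own validation: the helper name `is_valid_dataset_filename` is hypothetical, and the set of accepted compression suffixes is an assumption, since the checklist names only `.bz2` as an example.

```python
from fnmatch import fnmatchcase

# Assumed set of compression suffixes; the checklist only names .bz2.
COMPRESSION_SUFFIXES = (".bz2", ".gz", ".xz")


def is_valid_dataset_filename(filename: str) -> bool:
    """Return True if filename matches dataset*.json, ignoring one
    optional trailing compression suffix."""
    for suffix in COMPRESSION_SUFFIXES:
        if filename.endswith(suffix):
            # Strip the compression suffix before matching the pattern.
            filename = filename[: -len(suffix)]
            break
    return fnmatchcase(filename, "dataset*.json")
```

For example, `is_valid_dataset_filename("dataset.json.bz2")` accepts the file submitted in this pull request.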

@openff-dangerbot
Contributor

QCSubmit Validation Report

submissions/2024-10-30-OpenFF-Lipid-Optimization-Benchmark-Supplement-v1.0/dataset.json.bz2
Dataset Name OpenFF Lipid Optimization Benchmark Supplement v1.0
Dataset Type OptimizationDataset
Elements O, H, C, Br, P, N, Cl, F, S, I
Valid Cmiles 🔥
Connected Dihedrals 🔥
No Linear Torsions 🔥
No Molecular Complexes 🔥
Valid Constraints 🔥
Complete Metadata 🔥

QC Specification Report

submissions/2024-10-30-OpenFF-Lipid-Optimization-Benchmark-Supplement-v1.0/dataset.json.bz2/default
Specification Name default
Method B3LYP-D3BJ
Basis DZVP
Wavefunction Protocol none
Implicit Solvent
Keywords {}
Validated 🔥
Valid SCF Properties 🔥
Full Basis Coverage 🔥
QCSubmit version information
version
openff.qcsubmit 0.53.0
openff.toolkit 0.16.5
basis_set_exchange 0.10
qcelemental 0.28.0
rdkit 2024.09.2

@openff-dangerbot
Contributor

QCSubmit Validation Report

submissions/2024-10-30-OpenFF-Lipid-Optimization-Benchmark-Supplement-v1.0/dataset.json.bz2
Dataset Name OpenFF Lipid Optimization Benchmark Supplement v1.0
Dataset Type OptimizationDataset
Elements O, H, C, Br, P, N, Cl, F, S, I
Valid Cmiles 🔥
Connected Dihedrals 🔥
No Linear Torsions 🔥
No Molecular Complexes 🔥
Valid Constraints 🔥
Complete Metadata 🔥

QC Specification Report

submissions/2024-10-30-OpenFF-Lipid-Optimization-Benchmark-Supplement-v1.0/dataset.json.bz2/default
Specification Name default
Method B3LYP-D3BJ
Basis DZVP
Wavefunction Protocol none
Implicit Solvent
Keywords {}
Validated 🔥
Valid SCF Properties 🔥
Full Basis Coverage 🔥
QCSubmit version information
version
openff.qcsubmit 0.54.0
openff.toolkit 0.16.6
basis_set_exchange 0.10
qcelemental 0.28.0
rdkit 2024.09.3

@ntBre ntBre marked this pull request as ready for review December 2, 2024 21:47
@ntBre
Collaborator Author

ntBre commented Dec 2, 2024

Marking this ready for review now that #412 is in, and we don't need to worry about splitting it.

@ntBre ntBre requested a review from lilyminium December 2, 2024 21:47
Contributor

@lilyminium lilyminium left a comment

Minor note about the threshold -- otherwise LGTM. Looking forward to seeing how the split workers go!

@openff-dangerbot
Contributor

QCSubmit Validation Report

submissions/2024-10-30-OpenFF-Lipid-Optimization-Benchmark-Supplement-v1.0/dataset.json.bz2
Dataset Name OpenFF Lipid Optimization Benchmark Supplement v1.0
Dataset Type OptimizationDataset
Elements O, H, C, Br, P, N, Cl, F, S, I
Valid Cmiles 🔥
Connected Dihedrals 🔥
No Linear Torsions 🔥
No Molecular Complexes 🔥
Valid Constraints 🔥
Complete Metadata 🔥

QC Specification Report

submissions/2024-10-30-OpenFF-Lipid-Optimization-Benchmark-Supplement-v1.0/dataset.json.bz2/default
Specification Name default
Method B3LYP-D3BJ
Basis DZVP
Wavefunction Protocol none
Implicit Solvent
Keywords {}
Validated 🔥
Valid SCF Properties 🔥
Full Basis Coverage 🔥
QCSubmit version information
version
openff.qcsubmit 0.54.0
openff.toolkit 0.16.6
basis_set_exchange 0.10
qcelemental 0.28.0
rdkit 2024.09.3

@lilyminium
Contributor

LGTM, thank you Brent! I'll let you merge so you can figure out compute tags?

@lilyminium
Contributor

lilyminium commented Dec 4, 2024

  File "/home/runner/work/qca-dataset-submission/qca-dataset-submission/./management/lifecycle.py", line 16, in <module>
    from qcelemental.models import Molecule
ModuleNotFoundError: No module named 'qcelemental'

Ah oops -- probably from #412. Surprised it's not pulled in by qcportal!

Edit: actually, it's the backlog environment and script. I'll open a PR to import it only during type checking.
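
For readers unfamiliar with the pattern, importing only during type checking can be sketched as follows. This is a minimal illustration, not the actual change to `management/lifecycle.py`; the function `count_atoms` is hypothetical and exists only to give the annotation something to attach to.

```python
from __future__ import annotations  # annotations are stored as strings, not evaluated

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch never runs at runtime,
    # so the module imports cleanly even when qcelemental is not installed.
    from qcelemental.models import Molecule


def count_atoms(molecule: Molecule) -> int:
    """Hypothetical helper: return the number of atoms in a molecule."""
    return len(molecule.symbols)
```

Because the annotation is never evaluated at runtime, the script no longer raises `ModuleNotFoundError` in an environment without `qcelemental`, while type checkers still see the real `Molecule` type.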

Projects
Status: Backlog
3 participants