-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic unit conversion for experiments #11
Conversation
"pbe-ts", | ||
"ccsd/cc-pvdz", | ||
"ccsd(t)/cc-pvdz", | ||
# "pbe+mbd/light", #MD22 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not using those methods should we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends if by GDML we are talking about MD17, MD22 or both. Both of them are calculated from AIMD with FHI-aims with All Electron DFT. MD17 uses the pbe functional with vdw-ts, while MD22 uses pbe+mbd with different settings per system.
I'm going to calculate the isolated atoms energy for ccsd/cc-pvdz, ccsd(t)/cc-pvdz and I'm creating as we speak the pseudopotential for the pbe functional for the pbe+vdw-ts level of theory on DFT PAW. We would have issue on the many body dispersion as I can't use the correct convergence light/tight so I think just using the pbe+vdw-ts should be good enough
@@ -59,6 +59,10 @@ class GEOM(BaseDataset): | |||
__name__ = "geom" | |||
__energy_methods__ = ["gfn2_xtb"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FNTwin Do still need a basis set for semi-empirical methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't to specify a basis set for semi empirical method as they are fully defined by their name. You don't get to choose the basis set as it is "builded inside" the technique itself.
@@ -46,15 +46,19 @@ def _read_sdf(sdf_path, properties_path): | |||
|
|||
class Molecule3D(BaseDataset): | |||
__name__ = "molecule3d" | |||
__energy_methods__ = ["b3lyp_6-31g*"] | |||
__energy_methods__ = ["b3lyp/6-31g*"] | |||
# UNITS MOST LIKELY WRONG, MUST CHECK THEM MANUALLY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FNTwin is this already checked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I need to download the dataset and run some calculation and manually see what unit they are using as they don't provide any info. This PR should be just to have a quick method to interface openMLIP and convert the datasets between units for experimenting. A more heavy PR with every unit checked and validated will be with the calculated isolated atom energy
src/openqdc/datasets/qm7x.py
Outdated
|
||
energy_target_names = ["ePBE0", "eMBD"] | ||
|
||
__force_methods__ = ["pbe-ts", "vdw"] | ||
__force_methods__ = ["pbe+vdw-ts", "mbd"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FNTwin Are you sure the second force is mbd or just vdw forces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, corrected. It is a semi empirical tight binding with a many body dispersion correlation and a pbe0 for the dft calculation
@@ -51,13 +53,13 @@ class WaterClusters(BaseDataset): | |||
|
|||
# Energy in hartree, all zeros by default | |||
atomic_energies = np.zeros((MAX_ATOMIC_NUMBER,), dtype=np.float32) | |||
|
|||
# need to know where to find the data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so you downloaded the entire dataset made from the flexible potential. WIth the isolated atom energy PR I'll have a manual check of the data as with the other unsure dataset
Unit conversion for handling the experiments. Opening the PR while I clean the API and add some tests
It will take care of #9
TODO
Implement a registry of the most common unit conversions (units.py) and modify the constructor for the datasets to keep track of the current units and provide a way to change them.