Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic unit conversion for experiments #11

Merged
merged 8 commits into from
Oct 21, 2023
Merged

Basic unit conversion for experiments #11

merged 8 commits into from
Oct 21, 2023

Conversation

FNTwin
Copy link
Collaborator

@FNTwin FNTwin commented Oct 16, 2023

Unit conversion for handling the experiments. Opening the PR while I clean the API and add some tests
It will take care of #9

TODO

  • Tests
  • Some cleaning of the API

Implement a registry of the most common unit conversions (units.py) and modify the constructor for the datasets to keep track of the current units and provide a way to change them.

@FNTwin FNTwin requested a review from prtos as a code owner October 16, 2023 18:48
"pbe-ts",
"ccsd/cc-pvdz",
"ccsd(t)/cc-pvdz",
# "pbe+mbd/light", #MD22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not using those methods should we?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends if by GDML we are talking about MD17, MD22 or both. Both of them are calculated from AIMD with FHI-aims with All Electron DFT. MD17 uses the pbe functional with vdw-ts, while MD22 uses pbe+mbd with different settings per system.
I'm going to calculate the isolated atoms energy for ccsd/cc-pvdz, ccsd(t)/cc-pvdz and I'm creating as we speak the pseudopotential for the pbe functional for the pbe+vdw-ts level of theory on DFT PAW. We would have issue on the many body dispersion as I can't use the correct convergence light/tight so I think just using the pbe+vdw-ts should be good enough

@@ -59,6 +59,10 @@ class GEOM(BaseDataset):
__name__ = "geom"
__energy_methods__ = ["gfn2_xtb"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FNTwin Do still need a basis set for semi-empirical methods

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't to specify a basis set for semi empirical method as they are fully defined by their name. You don't get to choose the basis set as it is "builded inside" the technique itself.

@@ -46,15 +46,19 @@ def _read_sdf(sdf_path, properties_path):

class Molecule3D(BaseDataset):
__name__ = "molecule3d"
__energy_methods__ = ["b3lyp_6-31g*"]
__energy_methods__ = ["b3lyp/6-31g*"]
# UNITS MOST LIKELY WRONG, MUST CHECK THEM MANUALLY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FNTwin is this already checked?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I need to download the dataset and run some calculation and manually see what unit they are using as they don't provide any info. This PR should be just to have a quick method to interface openMLIP and convert the datasets between units for experimenting. A more heavy PR with every unit checked and validated will be with the calculated isolated atom energy


energy_target_names = ["ePBE0", "eMBD"]

__force_methods__ = ["pbe-ts", "vdw"]
__force_methods__ = ["pbe+vdw-ts", "mbd"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FNTwin Are you sure the second force is mbd or just vdw forces?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake, corrected. It is a semi empirical tight binding with a many body dispersion correlation and a pbe0 for the dft calculation

@@ -51,13 +53,13 @@ class WaterClusters(BaseDataset):

# Energy in hartree, all zeros by default
atomic_energies = np.zeros((MAX_ATOMIC_NUMBER,), dtype=np.float32)

# need to know where to find the data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so you downloaded the entire dataset made from the flexible potential. WIth the isolated atom energy PR I'll have a manual check of the data as with the other unsure dataset

@prtos prtos merged commit a86f884 into main Oct 21, 2023
@prtos prtos deleted the units branch October 21, 2023 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants