[Dataset request] VibML #59

Open
1 task
gpwolfe opened this issue Oct 8, 2024 · 0 comments

gpwolfe (Collaborator) commented Oct 8, 2024

Name

Gregory Wolfe

Email

gw2338@nyu.edu

Dataset name

VibML

Authors

Silvan Käser, Eric Boittier, Meenu Upadhyay, Markus Meuwly

Publication link

https://doi.org/10.48550/arXiv.2103.05491

Data link

https://zenodo.org/records/4585449

Additional links

https://doi.org/10.1021/acs.jctc.1c00249

Dataset description

From the Zenodo description:
The deposited data sets were used to obtain representations
of potential energy surfaces (PESs) for eight representative
molecules using a neural network of the PhysNet type [1]. The
molecules under investigation are H2CO, trans-HONO, HCOOH,
CH3OH, CH3CHO, CH3NO2, CH3COOH and CH3CONH2.

Reference data calculated at three different levels of quantum
chemical theory (MP2/aug-cc-pVTZ, CCSD(T)/aug-cc-pVTZ and
CCSD(T)-F12/aug-cc-pVTZ-F12) was used to train machine learning
(ML) models. Data sets at the MP2 level of theory were generated
for all molecules, at CCSD(T) level they were generated for
molecules with less than 7 atoms, and data sets at the CCSD(T)-F12
level of theory were generated for molecules with less than 6
atoms. The data sets contain different geometries for each
molecule generated using the normal mode sampling approach [2]
performed at different temperatures. The ab initio calculations
were performed using MOLPRO [3].

The performance of the PhysNet is then examined by considering
out-of-sample energy and force errors, harmonic frequencies in
comparison to explicit ab initio calculations and anharmonic
frequencies (obtained from a second order vibrational perturbation
theory (VPT2) analysis [4] as implemented in the Gaussian 09
suite [5]) in comparison to ab initio VPT2 calculations at the
MP2 level as well as to experiment.

From the publication:
Datasets at three levels of theory, including MP2/aug-cc-pVTZ, (16,17) CCSD(T)/aug-cc-pVTZ, (17−19) and CCSD(T)-F12/aug-cc-pVTZ-F12 (20,21) (referred to as “MP2”, “CCSD(T)”, and “CCSD(T)-F12” for convenience in the following), were generated. All single-point electronic structure calculations, including energies, forces, and dipole moments required for ML, as well as harmonic frequency calculations, were performed using MOLPRO. (22) Datasets at the MP2 level of theory were generated for all molecules; for CCSD(T), they were generated for molecules with Natom ≤ 6, and datasets at the CCSD(T)-F12 level of theory were generated for molecules with Natom ≤ 5.
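The atom-count thresholds quoted above determine which levels of theory exist for each molecule. As a rough sketch (the helper names `count_atoms` and `levels_of_theory` are made up for illustration and are not part of the dataset or the publication), the coverage can be derived from the condensed formulas:

```python
import re

# Molecules named in the Zenodo description ("trans-HONO" written as "HONO").
MOLECULES = ["H2CO", "HONO", "HCOOH", "CH3OH",
             "CH3CHO", "CH3NO2", "CH3COOH", "CH3CONH2"]


def count_atoms(formula: str) -> int:
    """Count atoms in a condensed formula such as 'CH3COOH'."""
    total = 0
    # Each match is an element symbol followed by an optional multiplicity.
    for _element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += int(digits) if digits else 1
    return total


def levels_of_theory(formula: str) -> list[str]:
    """Apply the thresholds from the publication: MP2 for all molecules,
    CCSD(T) for N_atom <= 6, CCSD(T)-F12 for N_atom <= 5."""
    n_atoms = count_atoms(formula)
    levels = ["MP2"]
    if n_atoms <= 6:
        levels.append("CCSD(T)")
    if n_atoms <= 5:
        levels.append("CCSD(T)-F12")
    return levels


if __name__ == "__main__":
    for molecule in MOLECULES:
        print(molecule, count_atoms(molecule), levels_of_theory(molecule))
```

Under these thresholds, CH3OH (6 atoms) is the largest molecule with CCSD(T) data, and HCOOH (5 atoms) is the largest with CCSD(T)-F12 data.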

File details

npz files. See the existing implementation for the aleatoric/epistemic error dataset for reference.
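Since the array names inside the npz files are not listed in this request, a first step could be to inspect one archive before writing a parser. A minimal sketch using NumPy (the function name `summarize_npz` is an illustrative choice, and no particular keys are assumed):

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz archive."""
    # np.load returns a lazy NpzFile; the context manager closes the file handle.
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }
```

Calling `summarize_npz` on one of the downloaded Zenodo files would show which arrays are available (for example energies, forces, or coordinates) and their shapes, which in turn gives the number of configurations per file.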

Method

CCSD(T)

Method (other)

MP2

Software

Other (provide software below)

Software (other)

MOLPRO

Software version(s)

No response

Additional details

No response

Property types

Potential energy

Energy field conjugate with forces

No response

Other/additional property

No response

Property details

not known

Elements

C,H,O,N

Number of Configurations

No response

Naming convention

File names can be keyed by method, molecule, and basis set (essentially MP2, CCSD(T), and CCSD(T)-F12).
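One way to key files by method is a simple substring check on the file name. This is only a sketch: the actual file names on Zenodo are not given here, so the tokens `f12`, `ccsd`, and `mp2` and the example names below are assumptions that would need to be checked against the real archive.

```python
def method_from_filename(filename: str) -> str:
    """Guess the level of theory from a file name.

    The tokens checked here are hypothetical; adjust them to the real
    naming scheme of the Zenodo files.
    """
    name = filename.lower()
    # Check the most specific token first, since an F12 file name
    # would presumably also contain "ccsd".
    if "f12" in name:
        return "CCSD(T)-F12"
    if "ccsd" in name:
        return "CCSD(T)"
    if "mp2" in name:
        return "MP2"
    raise ValueError(f"Cannot infer method from file name: {filename!r}")
```

For a hypothetical name such as `h2co_ccsdt_f12.npz`, this returns `"CCSD(T)-F12"`; a name with none of the tokens raises `ValueError` rather than silently mislabeling the data.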

Configuration sets

No response

Configuration labels

No response

Distribution license

CC-BY-4.0

Permissions

  • I confirm that I have the necessary permissions to submit this dataset