Float32 precision and EvalAI submission files for OC22 total energy predictions (#421)

* write tot_e predicts in float32
* adding total_energy=True to base OC22 configs
* assert oc22 predictions are fp32
* update to method for writing predictions to keep track of precision
* submission file to support oc22
* move energy values to cpu before writing predicts and updated make_submission script
* minor fix
* minor fix
* update to include prediction_dtype flag and remove check in make_submission_file.py
* added documentation for the prediction_type flag and oc22 evalai
* updated oc22 docs in TRAIN.md and minor changes to make_submission_file.py
* add joint training documentation

Co-authored-by: Muhammed Shuaibi <mshuaibi@andrew.cmu.edu>
Co-authored-by: Muhammed Shuaibi <45150244+mshuaibii@users.noreply.github.com>
Co-authored-by: Abhishek Das <das.abhshk@gmail.com>
FAIR-Chem · Jan 24, 2023 · 5a95b3d
1 parent 62de708
Showing 8 changed files with 170 additions and 50 deletions.
diff --git a/TRAIN.md b/TRAIN.md
@@ -1,13 +1,20 @@
-# Training and evaluating models on the OC20 dataset
+# Training and evaluating models on OCP datasets
 
 - [Getting Started](#getting-started)
-- [Initial Structure to Relaxed Energy (IS2RE)](#initial-structure-to-relaxed-energy-prediction-is2re)
-  - [IS2RE Relaxations](#is2re-relaxations)
-- [Structure to Energy and Forces (S2EF)](#structure-to-energy-and-forces-s2ef)
-- [Initial Structure to Relaxed Structure (IS2RS)](#initial-structure-to-relaxed-structure-is2rs)
-- [Create EvalAI submission files](#create-evalai-submission-files)
-  - [S2EF/IS2RE](#s2efis2re)
-  - [IS2RS](#is2rs)
+- [OC20](#oc20)
+  - [Initial Structure to Relaxed Energy (IS2RE)](#initial-structure-to-relaxed-energy-prediction-is2re)
+    - [IS2RE Relaxations](#is2re-relaxations)
+  - [Structure to Energy and Forces (S2EF)](#structure-to-energy-and-forces-s2ef)
+  - [Initial Structure to Relaxed Structure (IS2RS)](#initial-structure-to-relaxed-structure-is2rs)
+  - [Create EvalAI submission files](#create-evalai-oc20-submission-files)
+    - [S2EF/IS2RE](#s2efis2re)
+    - [IS2RS](#is2rs)
+- [OC22](#oc22)
+  - [Initial Structure to Total Relaxed Energy (IS2RE-Total)](#initial-structure-to-total-relaxed-energy-is2re-total)
+  - [Structure to Total Energy and Forces (S2EF-Total)](#structure-to-total-energy-and-forces-s2ef-total)
+  - [Joint Training](#joint-training)
+  - [Create EvalAI submission files](#create-evalai-oc22-submission-files)
+    - [S2EF-Total/IS2RE-Total](#s2ef-totalis2re-total)
 
 ## Getting Started
 
@@ -58,9 +65,11 @@ python main.py --distributed --num-gpus 8 --num-nodes 6 --submit [...]
 
 In the rest of this tutorial, we explain how to train models for each task.
 
+# OC20
+
 ## Initial Structure to Relaxed Energy prediction (IS2RE)
 
-In the IS2RE tasks, the model takes the initial structure as an input and predicts the structure’s energy
+In the IS2RE tasks, the model takes the initial structure as an input and predicts the structure’s adsorption energy
 in the relaxed state. To train a model for the IS2RE task, you can use the `EnergyTrainer`
 Trainer and `SinglePointLmdb` dataset by specifying the following in your configuration file:
 ```
@@ -133,7 +142,7 @@ Alternatively, the IS2RE task may be approached by 2 methods as described in our
         ```
 ## Structure to Energy and Forces (S2EF)
 
-In the S2EF task, the model takes the positions of the atoms as input and predicts the energy and per-atom
+In the S2EF task, the model takes the positions of the atoms as input and predicts the adsorption energy and per-atom
 forces as calculated by DFT. To train a model for the S2EF task, you can use the `ForcesTrainer` Trainer
 and `TrajectoryLmdb` dataset by specifying the following in your configuration file:
 ```
@@ -199,7 +208,7 @@ python main.py --mode run-relaxations --config-yml configs/s2ef/2M/schnet/schnet
 ```
 The relaxed structure positions are stored in `[RESULTS_DIR]/relaxed_positions.npz` and later used to create a submission file to be uploaded to EvalAI. Predicted trajectories are stored in `trajectories` directory for those interested in analyzing the complete relaxation trajectory.
 
-## Create EvalAI submission files
+## Create EvalAI OC20 submission files
 
 EvalAI expects results to be structured in a specific format for a submission to be successful. A submission must contain results from the 4 different splits - in distribution (id), out of distribution adsorbate (ood ads), out of distribution catalyst (ood cat), and out of distribution adsorbate and catalyst (ood both). Constructing the submission file for each of the above tasks is as follows:
 
@@ -223,3 +232,68 @@ EvalAI expects results to be structured in a specific format for a submission to
     ```
    The final submission file will be written to `is2rs_submission.npz` (rename accordingly).
 3. Upload `is2rs_submission.npz` to EvalAI.
+
+# OC22
+
+## Initial Structure to Total Relaxed Energy (IS2RE-Total)
+
+For the IS2RE-Total task, the model takes the initial structure as input and predicts the total DFT energy of the relaxed structure. This task is more general and more challenging than the original OC20 IS2RE task that predicts adsorption energy. To train an OC22 IS2RE-Total model use the `EnergyTrainer` with the `OC22LmdbDataset` by including these lines in your configuration file:
+
+```
+trainer: energy # Use the EnergyTrainer
+
+task:
+  dataset: oc22_lmdb # Use the OC22LmdbDataset
+  ...
+```
+You can find example configuration files in [`configs/oc22/is2re`](https://github.com/Open-Catalyst-Project/ocp/tree/main/configs/oc22/is2re).
+
+## Structure to Total Energy and Forces (S2EF-Total)
+
+The S2EF-Total task takes a structure and predicts the total DFT energy and per-atom forces. This differs from the original OC20 S2EF task because it predicts total energy instead of adsorption energy. To train an OC22 S2EF-Total model, use the `ForcesTrainer` with the `OC22LmdbDataset` by including these lines in your configuration file:
+
+```
+trainer: forces  # Use the ForcesTrainer
+
+task:
+  dataset: oc22_lmdb # Use the OC22LmdbDataset
+  ...
+```
+You can find example configuration files in [`configs/oc22/s2ef`](https://github.com/Open-Catalyst-Project/ocp/tree/main/configs/oc22/s2ef).
+
+## Joint Training
+
+Training on OC20 total energies, whether independently or jointly with OC22, requires `total_energy: True` and a path to the `oc20_ref` file (download link provided below) in the configuration file. Both are needed to convert OC20 adsorption energies into their corresponding total energies. The following configuration shows these settings:
+
+```
+task:
+  dataset: oc22_lmdb
+  ...
+  
+dataset:
+  train:
+    src: data/oc20+oc22/s2ef/train
+    normalize_labels: False
+    total_energy: True
+    #download at https://dl.fbaipublicfiles.com/opencatalystproject/data/oc22/oc20_ref.pkl
+    oc20_ref: path/to/oc20_ref.pkl
+  val:
+    src: data/oc22/s2ef/val_id
+    total_energy: True
+    oc20_ref: path/to/oc20_ref.pkl
+```
+
+You can find an example configuration file at [configs/oc22/s2ef/base_joint.yml](https://github.com/Open-Catalyst-Project/ocp/blob/main/configs/oc22/s2ef/base_joint.yml).
+
+## Create EvalAI OC22 submission files
+
+EvalAI expects results to be structured in a specific format for a submission to be successful. A submission must contain results from the 2 different splits - in distribution (id) and out of distribution (ood). Construct submission files for the OC22 S2EF-Total/IS2RE-Total tasks as follows:
+
+### S2EF-Total/IS2RE-Total:
+1. Run predictions `--mode predict` on both the id and ood splits, generating `[s2ef/is2re]_predictions.npz` files for each split.
+2. Run the following command:
+    ```
+    python make_submission_file.py --dataset OC22 --id path/to/id/file.npz --ood path/to/ood/file.npz --out-path submission_file.npz
+    ```
+   Where `file.npz` corresponds to the respective `[s2ef/is2re]_predictions.npz` files generated for the corresponding task. The final submission file will be written to `submission_file.npz` (rename accordingly). The `dataset` argument specifies which dataset is being considered — this only needs to be set for OC22 predictions because OC20 is the default.
+3. Upload `submission_file.npz` to EvalAI.
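
Note on the Joint Training section in the TRAIN.md changes above: the conversion it mentions can be pictured as adding a per-system reference energy back onto each adsorption energy. The sketch below assumes, for illustration only, that the reference data maps a system id to a single reference energy in eV. The dictionary contents, the id format, and the `to_total_energy` helper are hypothetical and are not taken from `oc20_ref.pkl` or the OCP code.

```python
# Hypothetical stand-in for the contents of the reference file:
# one reference energy (eV) per system id. Values are made up.
reference_energies = {"system_1": -320.5, "system_2": -410.25}


def to_total_energy(system_id, adsorption_energy, reference):
    """Recover a total energy from an adsorption energy.

    Assumes adsorption energy = total energy - reference energy,
    so the total is the sum of the two.
    """
    return adsorption_energy + reference[system_id]


total = to_total_energy("system_1", -1.5, reference_energies)
print(total)
```

Totals produced this way are typically much larger in magnitude than adsorption energies, which is relevant to the `prediction_dtype` change elsewhere in this commit.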
diff --git a/configs/oc22/is2re/base.yml b/configs/oc22/is2re/base.yml
@@ -4,8 +4,10 @@ dataset:
   train:
     src: data/oc22/is2re/train
     normalize_labels: False
+    total_energy: True
   val:
     src: data/oc22/is2re/val_id
+    total_energy: True
 
 logger: wandb
 

diff --git a/configs/oc22/s2ef/base.yml b/configs/oc22/s2ef/base.yml
@@ -4,8 +4,10 @@ dataset:
   train:
     src: data/oc22/s2ef/train
     normalize_labels: False
+    total_energy: True
   val:
     src: data/oc22/s2ef/val_id
+    total_energy: True
 
 logger: wandb
 
@@ -20,6 +22,7 @@ task:
   grad_input: atomic forces
   train_on_free_atoms: True
   eval_on_free_atoms: True
+  prediction_dtype: float32
 
 optim:
   loss_energy: mae

diff --git a/configs/oc22/s2ef/base_joint.yml b/configs/oc22/s2ef/base_joint.yml
@@ -25,3 +25,4 @@ task:
   grad_input: atomic forces
   train_on_free_atoms: True
   eval_on_free_atoms: True
+  prediction_dtype: float32
diff --git a/configs/s2ef/example.yml b/configs/s2ef/example.yml
@@ -25,6 +25,10 @@ task:
   # These args specify whether to train/eval forces on only free atoms or all.
   train_on_free_atoms: True                                                     # True or False
   eval_on_free_atoms: True                                                      # True or False
+  # By default OC20 s2ef predictions are written in float16 to reduce file size
+  # By default OC22 s2ef predictions are written in float32
+  # If training on total energy use float32
+  prediction_dtype: float16                                                     # 'float16' or 'float32'
   # This is an argument used for checkpoint loading. By default it is True and loads
   # checkpoint as it is. If False, it could partially load the checkpoint without giving
   # any errors

diff --git a/ocpmodels/trainers/energy_trainer.py b/ocpmodels/trainers/energy_trainer.py
@@ -149,7 +149,9 @@ def predict(
                 predictions["id"].extend(
                     [str(i) for i in batch[0].sid.tolist()]
                 )
-                predictions["energy"].extend(out["energy"].tolist())
+                predictions["energy"].extend(
+                    out["energy"].cpu().detach().numpy()
+                )
             else:
                 predictions["energy"] = out["energy"].detach()
                 return predictions

diff --git a/ocpmodels/trainers/forces_trainer.py b/ocpmodels/trainers/forces_trainer.py
@@ -208,14 +208,26 @@ def predict(
                     )
                 ]
                 predictions["id"].extend(systemids)
-                predictions["energy"].extend(
-                    out["energy"].to(torch.float16).tolist()
-                )
                 batch_natoms = torch.cat(
                     [batch.natoms for batch in batch_list]
                 )
                 batch_fixed = torch.cat([batch.fixed for batch in batch_list])
-                forces = out["forces"].cpu().detach().to(torch.float16)
+                # total energy target requires predictions to be saved in float32
+                # default is float16
+                if (
+                    self.config["task"].get("prediction_dtype", "float16")
+                    == "float32"
+                    or self.config["task"]["dataset"] == "oc22_lmdb"
+                ):
+                    predictions["energy"].extend(
+                        out["energy"].cpu().detach().to(torch.float32).numpy()
+                    )
+                    forces = out["forces"].cpu().detach().to(torch.float32)
+                else:
+                    predictions["energy"].extend(
+                        out["energy"].cpu().detach().to(torch.float16).numpy()
+                    )
+                    forces = out["forces"].cpu().detach().to(torch.float16)
                 per_image_forces = torch.split(forces, batch_natoms.tolist())
                 per_image_forces = [
                     force.numpy() for force in per_image_forces
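
Note on the `forces_trainer.py` change above: the comment in the diff says total energy targets require float32. A minimal sketch of why, assuming made-up energy values, is below. `resolve_prediction_dtype` and `roundtrip_error` are hypothetical helpers written for this note; the first only mirrors the condition shown in the diff, using a plain dict in place of the trainer config.

```python
import numpy as np


def resolve_prediction_dtype(task_config):
    """Pick float32 if requested or if the dataset is OC22, else float16."""
    wants_fp32 = task_config.get("prediction_dtype", "float16") == "float32"
    is_oc22 = task_config.get("dataset") == "oc22_lmdb"
    return np.float32 if (wants_fp32 or is_oc22) else np.float16


def roundtrip_error(value, dtype):
    """Absolute error introduced by storing `value` in `dtype`."""
    return abs(float(dtype(value)) - value)


# Illustrative magnitudes only: a total energy of a few hundred eV
# versus an adsorption energy of about one eV.
total_energy = -345.678
adsorption_energy = -1.234

err16_total = roundtrip_error(total_energy, np.float16)
err32_total = roundtrip_error(total_energy, np.float32)
err16_ads = roundtrip_error(adsorption_energy, np.float16)

print(f"float16, total energy:      {err16_total:.6f} eV")
print(f"float32, total energy:      {err32_total:.6f} eV")
print(f"float16, adsorption energy: {err16_ads:.6f} eV")
```

With 10 mantissa bits, float16 can only represent values between 256 and 512 in steps of 0.25, so the rounding error on a total energy of this size is far larger than on an adsorption energy near 1 eV.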

diff --git a/scripts/make_submission_file.py b/scripts/make_submission_file.py
@@ -11,30 +11,36 @@
 
 import numpy as np
 
+SPLITS = {
+    "OC20": ["id", "ood_ads", "ood_cat", "ood_both"],
+    "OC22": ["id", "ood"],
+}
 
-def write_is2re_relaxations(paths, filename, hybrid):
+
+def write_is2re_relaxations(args):
     import ase.io
     from tqdm import tqdm
 
     submission_file = {}
 
-    if not hybrid:
-        for idx, split in enumerate(["id", "ood_ads", "ood_cat", "ood_both"]):
+    if not args.hybrid:
+        for split in SPLITS[args.dataset]:
             ids = []
             energies = []
-            systems = glob.glob(os.path.join(paths[idx], "*.traj"))
+            systems = glob.glob(os.path.join(vars(args)[split], "*.traj"))
             for system in tqdm(systems):
                 sid, _ = os.path.splitext(os.path.basename(system))
                 ids.append(str(sid))
+                # Read the last frame in the ML trajectory. Change "-1" to use a different frame.
                 traj = ase.io.read(system, "-1")
                 energies.append(traj.get_potential_energy())
 
             submission_file[f"{split}_ids"] = np.array(ids)
             submission_file[f"{split}_energy"] = np.array(energies)
 
     else:
-        for idx, split in enumerate(["id", "ood_ads", "ood_cat", "ood_both"]):
-            preds = np.load(paths[idx])
+        for split in SPLITS[args.dataset]:
+            preds = np.load(vars(args)[split])
             ids = []
             energies = []
             for sid, energy in zip(preds["ids"], preds["energy"]):
@@ -45,54 +51,52 @@ def write_is2re_relaxations(paths, filename, hybrid):
             submission_file[f"{split}_ids"] = np.array(ids)
             submission_file[f"{split}_energy"] = np.array(energies)
 
-    np.savez_compressed(filename, **submission_file)
+    np.savez_compressed(args.out_path, **submission_file)
 
 
-def write_predictions(paths, filename):
-    submission_file = {}
+def write_predictions(args):
+    if args.is2re_relaxations:
+        write_is2re_relaxations(args)
+    else:
+        submission_file = {}
 
-    for idx, split in enumerate(["id", "ood_ads", "ood_cat", "ood_both"]):
-        res = np.load(paths[idx], allow_pickle=True)
-        contents = res.files
-        for i in contents:
-            key = "_".join([split, i])
-            submission_file[key] = res[i]
+        for split in SPLITS[args.dataset]:
+            res = np.load(vars(args)[split], allow_pickle=True)
+            contents = res.files
+            for i in contents:
+                key = "_".join([split, i])
+                submission_file[key] = res[i]
 
-    np.savez_compressed(filename, **submission_file)
+        np.savez_compressed(args.out_path, **submission_file)
 
 
 def main(args):
-    id_path = args.id
-    ood_ads_path = args.ood_ads
-    ood_cat_path = args.ood_cat
-    ood_both_path = args.ood_both
+    for split in SPLITS[args.dataset]:
+        assert vars(args).get(
+            split
+        ), f"Missing {split} split for {args.dataset}"
 
-    paths = [id_path, ood_ads_path, ood_cat_path, ood_both_path]
     if not args.out_path.endswith(".npz"):
         args.out_path = args.out_path + ".npz"
 
-    if not args.is2re_relaxations:
-        write_predictions(paths, filename=args.out_path)
-    else:
-        write_is2re_relaxations(
-            paths, filename=args.out_path, hybrid=args.hybrid
-        )
+    write_predictions(args)
     print(f"Results saved to {args.out_path} successfully.")
 
 
 if __name__ == "__main__":
     """
     Create a submission file for evalAI. Ensure that for the task you are
-    submitting for you have generated results files on each of the 4 splits -
-    id, ood_ads, ood_cat, ood_both.
+    submitting for you have generated results files on each of the splits:
+        OC20: id, ood_ads, ood_cat, ood_both
+        OC22: id, ood
 
     Results file can be obtained as follows for the various tasks:
 
     S2EF: config["mode"] = "predict"
     IS2RE: config["mode"] = "predict"
     IS2RS: config["mode"] = "run-relaxations" and config["task"]["write_pos"] = True
 
-    Use this script to join the 4 results files in the format evalAI expects
+    Use this script to join the results files (4 for OC20, 2 for OC22) in the format evalAI expects
     submissions.
 
     If writing IS2RE predictions from relaxations, paths must be directories
@@ -106,10 +110,21 @@ def main(args):
     """
 
     parser = argparse.ArgumentParser()
-    parser.add_argument("--id", help="Path to ID results")
-    parser.add_argument("--ood-ads", help="Path to OOD-Ads results")
-    parser.add_argument("--ood-cat", help="Path to OOD-Cat results")
-    parser.add_argument("--ood-both", help="Path to OOD-Both results")
+    parser.add_argument(
+        "--id", help="Path to ID results. Required for OC20 and OC22."
+    )
+    parser.add_argument(
+        "--ood-ads", help="Path to OOD-Ads results. Required only for OC20."
+    )
+    parser.add_argument(
+        "--ood-cat", help="Path to OOD-Cat results. Required only for OC20."
+    )
+    parser.add_argument(
+        "--ood-both", help="Path to OOD-Both results. Required only for OC20."
+    )
+    parser.add_argument(
+        "--ood", help="Path to OOD OC22 results. Required only for OC22."
+    )
     parser.add_argument("--out-path", help="Path to write predictions to.")
     parser.add_argument(
         "--is2re-relaxations",
@@ -121,6 +136,13 @@ def main(args):
         action="store_true",
         help="Write IS2RE results from S2EF prediction files. Paths specified correspond to S2EF NPZ files.",
     )
+    parser.add_argument(
+        "--dataset",
+        type=str,
+        default="OC20",
+        choices=["OC20", "OC22"],
+        help="Which dataset to write a prediction file for, OC20 or OC22.",
+    )
 
     args = parser.parse_args()
     main(args)
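
Note on the `make_submission_file.py` change above: the per-split merge in `write_predictions` can be tried end to end on toy data. The sketch below is a simplified stand-in, not the script itself. It assumes each prediction file holds only `ids` and `energy` arrays with made-up values, and `merge_split_files` is a hypothetical helper that takes a dict of paths instead of parsed arguments.

```python
import os
import tempfile

import numpy as np

SPLITS = {
    "OC20": ["id", "ood_ads", "ood_cat", "ood_both"],
    "OC22": ["id", "ood"],
}


def merge_split_files(paths, dataset):
    """Prefix every array in each split's file with the split name."""
    merged = {}
    for split in SPLITS[dataset]:
        with np.load(paths[split], allow_pickle=True) as res:
            for name in res.files:
                merged[f"{split}_{name}"] = res[name]
    return merged


with tempfile.TemporaryDirectory() as tmp:
    # Write one toy prediction file per OC22 split.
    paths = {}
    for split in SPLITS["OC22"]:
        paths[split] = os.path.join(tmp, f"{split}.npz")
        np.savez_compressed(
            paths[split],
            ids=np.array(["1", "2"]),
            energy=np.array([-345.5, -120.25], dtype=np.float32),
        )

    # Merge and write the submission file, then read it back.
    out_path = os.path.join(tmp, "submission_file.npz")
    np.savez_compressed(out_path, **merge_split_files(paths, "OC22"))
    with np.load(out_path) as loaded:
        keys = sorted(loaded.files)
        energy_dtype = loaded["ood_energy"].dtype

print(keys)
print(energy_dtype)
```

Reading the file back like this is a quick way to confirm that every expected split is present and that the energies kept their float32 dtype before uploading.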