Skip to content

Latest commit

 

History

History
314 lines (264 loc) · 14.6 KB

CHANGELOG.md

File metadata and controls

314 lines (264 loc) · 14.6 KB

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Most recent change on the bottom.

Unreleased

[0.6.1] - 2024-7-9

Added

  • add support for equivariance testing of arbitrary Cartesian tensor outputs
  • [Breaking] use entry points for nequip.extensions (e.g. for field registration)
  • alternate neighborlist support enabled with NEQUIP_NL environment variable, which can be set to ase (default), matscipy or vesin
  • Allow n_train and n_val to be specified as percentages of datasets.
  • Only attempt training restart if trainer.pth file present (prevents unnecessary crashes due to file-not-found errors in some cases)

Changed

  • [Breaking] NEQUIP_MATSCIPY_NL environment variable no longer supported

Fixed

  • Fixed flake8 install location in pre-commit-config.yaml

[0.6.0] - 2024-5-10

Added

  • add Tensorboard as logger option
  • [Breaking] Refactor overall model logic into GraphModel top-level module
  • [Breaking] Added model_dtype
  • BATCH_PTR_KEY in AtomicDataDict
  • AtomicInMemoryDataset.rdf() and examples/rdf.py
  • type_to_chemical_symbol
  • Pair potential terms
  • nequip-evaluate --output-fields-from-original-dataset
  • Error (or warn) on unused options in YAML that likely indicate typos
  • dataset_*_absmax statistics option
  • HDF5Dataset (#227)
  • include_file_as_baseline_config for simple modifications of existing configs
  • nequip-deploy --using-dataset to support data-dependent deployment steps
  • Support for Gaussian Mixture Model uncertainty quantification (https://doi.org/10.1063/5.0136574)
  • start_of_epoch_callbacks
  • nequip.train.callbacks.loss_schedule.SimpleLossSchedule for changing the loss coefficients at specified epochs
  • nequip-deploy build --checkpoint and --override to avoid many largely duplicated YAML files
  • matscipy neighborlist support enabled with NEQUIP_MATSCIPY_NL environment variable

Changed

  • Always require explicit seed
  • [Breaking] Set dataset_seed to seed if it is not explicitly provided
  • Don't log as often by default
  • [Breaking] Default nonlinearities are silu (e) and tanh (o)
  • Will not reproduce previous versions' data shuffling order (for all practical purposes this does not matter, the shuffle option is unchanged)
  • [Breaking] default_dtype defaults to float64 (model_dtype default float32, allow_tf32: true by default--- see https://arxiv.org/abs/2304.10061)
  • nequip-benchmark now only uses --n-data frames to build the model
  • [Breaking] By default models now use StressForceOutput, not ForceOutput
  • Added edge_energy to ALL_ENERGY_KEYS subjecting it to global rescale

Fixed

  • Work with wandb>=0.13.8
  • Better error for standard deviation with too few data
  • load_model_state GPU -> CPU
  • No negative volumes in rare cases

Removed

  • [Breaking] fixed_fields machinery (npz_fixed_field_keys is still supported, but through a more straightforward implementation)
  • Default run name/WandB project name of NequIP, they must now always be provided explicitly
  • [Breaking] Removed _params as an allowable subconfiguration suffix (i.e. instead of optimizer_params now only optimizer_kwargs is valid, not both)
  • [Breaking] Removed per_species_rescale_arguments_in_dataset_units

[0.5.6] - 2022-12-19

Added

  • sklearn dependency removed
  • nequip-benchmark and nequip-train report number of weights and number of trainable weights
  • nequip-benchmark --no-compile and --verbose and --memory-summary
  • nequip-benchmark --pdb for debugging model (builder) errors
  • More information in nequip-deploy info
  • GPU OOM offloading mode

Changed

  • Minimum e3nn is now 0.4.4
  • --equivariance-test now prints much more information, especially when there is a failure

Fixed

  • Git utilities when installed as ZIPed .egg (#264)

[0.5.5] - 2022-06-20

Added

  • BETA! Support for stress in training and inference
  • EMTTestDataset for quick synthetic fake PBC data
  • multiprocessing for ASE dataset loading/processing
  • nequip-benchmark times dataset loading, model creation, and compilation
  • validation_batch_size
  • support multiple metrics on same field with different functionals
  • allow custom metrics names
  • allow e3nn==0.5.0
  • --verbose option to nequip-deploy
  • print data statistics in nequip-benchmark
  • normalized_sum reduction in AtomwiseReduce

Changed

  • abbreviate node_features->h in loss titles
  • failure of permutation equivariance tests no longer short-circuts o3 equivariance tests
  • NequIPCalculator now stores all relevant properties computed by the model regardless of requested properties, and does not try to access those not computed by the model, allowing models that only compute energy or forces but not both

Fixed

  • Equivariance testing correctly handles output cells
  • Equivariance testing correctly handles one-node or one-edge data
  • report_init_validation now runs on validation set instead of training set
  • crash when unable to find os.sched_getaffinity on some systems
  • don't incorrectly log per-species scales/shifts when loading model (such as for deployment)
  • nequip-benchmark now picks data frames deterministically
  • useful error message for metrics_key: training_* with report_init_validation: True (#213)

[0.5.4] - 2022-04-12

Added

  • NequIPCalculator now handles per-atom energies
  • Added initial_model_state_strict YAML option
  • load_model_state builder
  • fusion strategy support
  • cumulative_wall for early stopping
  • Deploy model from YAML file directly

Changed

  • Disallow PyTorch 1.9, which has some JIT bugs.
  • nequip-deploy build now requires --train-dir option when specifying the training session
  • Minimum Python version is now 3.7

Fixed

  • Better error in Dataset.statistics when field is missing
  • NequIPCalculator now outputs energy as scalar rather than (1, 1) array
  • dataset: ase now treats automatically adds key_mapping keys to include_keys, which is consistant with the npz dataset
  • fixed reloading models with per_species_rescale_scales/shifts set to null/None
  • graceful exit for -n 0 in nequip-benchmark
  • Strictly correct CSV headers for metrics (#198)

[0.5.3] - 2022-02-23

Added

  • nequip-evaluate --repeat option
  • Report number of weights to wandb

Changed

  • defaults and commments in example.yaml and full.yaml, in particular longer default training and correct comment for E:F-weighting
  • better metrics config in example.yaml and full.yaml, in particular will total F-MAE/F-RMSE instead of mean over per-species
  • default value for report_init_validation is now True
  • all_*_* metrics rename to -> psavg_*_*
  • avg_num_neighbors default None -> auto

Fixed

  • error if both per-species and global shift are used together

[0.5.2] - 2022-02-04

Added

  • Model builders may now process only the configuration
  • Allow irreps to optionally be specified through the simplified keys l_max, parity, and num_features
  • wandb.watch via wandb_watch option
  • Allow polynomial cutoff p values besides 6.0
  • nequip-evaluate now sets a default r_max taken from the model for the dataset config
  • Support multiple rescale layers in trainer
  • AtomicData.to_ase supports arbitrary fields
  • nequip-evaluate can now output arbitrary fields to an XYZ file
  • nequip-evaluate reports which frame in the original dataset was used as input for each output frame

Changed

  • minimal.yaml, minimal_eng.yaml, and example.yaml now use the simplified irreps options l_max, parity, and num_features
  • Default value for resnet is now False

Fixed

  • Handle one of per_species_shifts/scales being null when the other is a dataset statistc
  • include_frames now works with ASE datasets
  • no training data labels in input_data
  • Average number of neighbors no longer crashes sometimes when not all nodes have neighbors (small cutoffs)
  • Handle field registrations correctly in nequip-evaluate

Removed

  • compile_model

[0.5.1] - 2022-01-13

Added

  • NequIPCalculator can now be built via a nequip_calculator() function. This adds a minimal compatibility with vibes
  • Added avg_num_neighbors: auto option
  • Asynchronous IO: during training, models are written asynchronously. Enable this with environment variable NEQUIP_ASYNC_IO=true.
  • dataset_seed to separately control randomness used to select training data (and their order).
  • The types may now be specified with a simpler chemical_symbols option
  • Equivariance testing reports per-field errors
  • --equivariance-test n tests equivariance on n frames from the training dataset

Changed

  • All fields now have consistant [N, dim] shaping
  • Changed default seed and dataset_seed in example YAMLs
  • Equivariance testing can only use training frames now

Fixed

  • Equivariance testing no longer unintentionally skips translation
  • Correct cat dim for all registered per-graph fields
  • PerSpeciesScaleShift now correctly outputs when scales, but not shifts, are enabled— previously it was broken and would only output updated values when both were enabled.
  • nequip-evaluate outputs correct species to the extxyz file when a chemical symbol <-> type mapping exists for the test dataset

[0.5.0] - 2021-11-24

Changed

  • Allow e3nn 0.4.*, which changes the default normalization of TensorProducts; this change should not affect typical NequIP networks
  • Deployed are now frozen on load, rather than compile

Fixed

  • load_deployed_model respects global JIT settings

[0.4.0] - not released

Added

  • Support for e3nn's soft_one_hot_linspace as radial bases
  • Support for parallel dataloader workers with dataloader_num_workers
  • Optionally independently configure validation and training datasets
  • Save dataset parameters along with processed data
  • Gradient clipping
  • Arbitrary atom type support
  • Unified, modular model building and initialization architecture
  • Added nequip-benchmark script for benchmarking and profiling models
  • Add before option to SequentialGraphNetwork.insert
  • Normalize total energy loss by the number of atoms via PerAtomLoss
  • Model builder to initialize training from previous checkpoint
  • Better error when instantiation fails
  • Rename npz_keys to include_keys
  • Allow user to register graph_fields, node_fields, and edge_fields via yaml
  • Deployed models save the e3nn and torch versions they were created with

Changed

  • Update example.yaml to use wandb by default, to only use 100 epochs of training, to set a very large batch logging frequency and to change Validation_loss to validation_loss
  • Name processed datasets based on a hash of their parameters to ensure only valid cached data is used
  • Do not use TensorFloat32 by default on Ampere GPUs until we understand it better
  • No atomic numbers in networks
  • dataset_energy_std/dataset_energy_mean to dataset_total_energy_*
  • nequip.dynamics -> nequip.ase
  • update example.yaml and full.yaml with better defaults, new loss function, and switched to toluene-ccsd(t) as example data
  • use_sc defaults to True
  • register_fields is now in nequip.data
  • Default total energy scaling is changed from global mode to per species mode.
  • Renamed trainable_global_rescale_scale to global_rescale_scale_trainble
  • Renamed trainable_global_rescale_shift to global_rescale_shift_trainble
  • Renamed PerSpeciesScaleShift_ to per_species_rescale
  • Change default and allowed values of metrics_key from loss to validation_loss. The old default loss will no longer be accepted.
  • Renamed per_species_rescale_trainable to per_species_rescale_scales_trainable and per_species_rescale_shifts_trainable

Fixed

  • The first 20 epochs/calls of inference are no longer painfully slow for recompilation
  • Set global options like TF32, dtype in nequip-evaluate
  • Avoid possilbe race condition in caching of processed datasets across multiple training runs

Removed

  • Removed allowed_species
  • Removed --update-config; start a new training and load old state instead
  • Removed dependency on pytorch_geometric
  • nequip-train no longer prints the full config, which can be found in the training dir as config.yaml.
  • nequip.datasets.AspirinDataset & nequip.datasets.WaterDataset
  • Dependency on pytorch_scatter

[0.3.3] - 2021-08-11

Added

  • to_ase method in AtomicData.py to convert AtomicData object to (list of) ase.Atoms object(s)
  • SequentialGraphNetwork now has insertion methods
  • nn.SaveForOutput
  • nequip-evaluate command for evaluating (metrics on) trained models
  • AtomicData.from_ase now catches energy/energies arrays

Changed

  • Nonlinearities now specified with e and o instead of 1 and -1
  • Update interfaces for torch_geometric 1.7.1 and e3nn 0.3.3
  • nonlinearity_scalars now also affects the nonlinearity used in the radial net of InteractionBlock
  • Cleaned up naming of initializers

Fixed

  • Fix specifying nonlinearities when wandb enabled
  • Final backport for <3.8 compatability
  • Fixed nequip-* commands when using pip install
  • Default models rescale per-atom energies, and not just total
  • Fixed Python <3.8 backward compatability with atomic_save

[0.3.2] - 2021-06-09

Added

  • Option for which nonlinearities to use
  • Option to save models every n epochs in training
  • Option to specify optimization defaults for e3nn

Fixed

  • Using wandb no longer breaks the inclusion of special objects like callables in configs

[0.3.1]

Fixed

  • iepoch is no longer off-by-one when restarting a training run that hit max_epochs
  • Builders, and not just sub-builders, use the class name as a default prefix

Added

  • early_stopping_xxx arguments added to enable early stop for platued values or values that out of lower/upper bounds.

[0.3.0] - 2021-05-07

Added

  • Sub-builders can be skipped in instantiate by setting them to None
  • More flexible model initialization
  • Add MD w/ Nequip-ASE-calculator + run-MD script w/ custom Nose-Hoover

Changed

  • PBC must be explicit if a cell is provided
  • Training now uses atomic file writes to avoid corruption if interupted
  • feature_embedding renamed to chemical_embedding in default models

Fixed

  • BesselBasis now works on GPU when trainable=False
  • Dataset extra_fixed_fields are now added even if get_data() returns AtomicData objects

[0.2.1] - 2021-05-03

Fixed

  • load_deployed_model now correctly loads all metadata

[0.2.0] - 2021-04-30