[BE] Add smoke test for [escn,gemnet,equiformer_v2] train+predict, Add optimization test for [escn,gemnet,equiformer_v2] #640

misko · 2024-03-30T03:27:35Z

Add a test case for training a simple eSCN model on a small subset of OC22 (first 16 elements of train).

To support this I added the pickle dataset format. This is a simple list of PyTorch Geometric objects returned from the LMDB dataloaders.

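import pickle

from ocpmodels.datasets import LmdbDataset  # import path assumed; adjust to your checkout


def lmdb_dataset_to_pkl(lmdb_src, pickle_output_filename, n=128):
    # Read the first n samples from an existing LMDB dataset and pickle them as a list.
    ds = LmdbDataset({"src": lmdb_src})
    with open(pickle_output_filename, "wb") as f:
        pickle.dump([ds[idx] for idx in range(n)], f)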

^ Generates a pickle dataset from an existing LMDB dataset.
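The reader side of the pickle format is not shown in this thread. A minimal sketch of what such a reader could look like follows; the class name `PickleListDataset` is hypothetical, and the `{"src": ...}` config shape is assumed only because it mirrors the `LmdbDataset` call above. Plain dicts stand in for the PyTorch Geometric objects so the example runs without extra dependencies.

```python
import pickle
import tempfile
from pathlib import Path


class PickleListDataset:
    """Hypothetical reader that wraps a pickled list of samples."""

    def __init__(self, config):
        # Assumed config shape: {"src": <path to pickle file>}.
        with open(config["src"], "rb") as f:
            self.samples = pickle.load(f)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


# Stand-in samples; in the PR these would be PyTorch Geometric objects.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "tiny.pkl"
    with open(path, "wb") as f:
        pickle.dump([{"natoms": 3}, {"natoms": 5}], f)
    ds = PickleListDataset({"src": path})

print(len(ds), ds[1])
```

Because the whole list is loaded into memory at construction time, this only makes sense for the very small subsets a smoke test needs.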

We can use tests like this in the main repo, and also while developing experimental branches, to automatically confirm that we did not obviously break training.

codecov · 2024-03-30T16:26:37Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.29%. Comparing base (daf72a5) to head (910aa86).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #640      +/-   ##
==========================================
+ Coverage   57.27%   62.29%   +5.01%     
==========================================
  Files         109      109              
  Lines       10316    10316              
==========================================
+ Hits         5909     6426     +517     
+ Misses       4407     3890     -517


mshuaibii · 2024-04-09T19:28:51Z

Documenting our chat offline:

Introducing another dataset is unnecessary; we can leverage existing readers even for the unit tests.
Rather than uploading the datasets directly to the repo, we can upload them to S3 and download them as part of the tests, the same way we currently handle checkpoints.
It would be great to include eqv2 instead of or alongside escn, as it's the more commonly used model.
We should make sure to test the other modes as well.
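As a rough illustration of the download-in-tests idea, here is a sketch of a download-and-cache helper. The function name `fetch_test_asset` and the cache layout are assumptions, not the repo's actual checkpoint-download code; a local `file://` URL stands in for an S3 URL so the example runs offline.

```python
import tempfile
import urllib.request
from pathlib import Path


def fetch_test_asset(url, cache_dir, filename=None):
    """Download url into cache_dir unless a cached copy already exists."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / (filename or url.rsplit("/", 1)[-1])
    if not target.exists():
        # urlretrieve accepts http(s):// as well as file:// URLs.
        urllib.request.urlretrieve(url, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical "remote" file standing in for an object stored on S3.
    source = Path(tmp) / "remote" / "tiny_dataset.bin"
    source.parent.mkdir()
    source.write_bytes(b"example bytes")

    cache = Path(tmp) / "cache"
    first = fetch_test_asset(source.as_uri(), cache)
    downloaded = first.read_bytes()

    # Remove the source: the second call must be served from the cache.
    source.unlink()
    second = fetch_test_asset(source.as_uri(), cache)
    cache_hit = second == first and second.exists()
```

Caching by filename keeps repeated test runs from downloading again, at the cost of not noticing when the remote file changes.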

misko · 2024-04-16T17:09:35Z

> Documenting our chat offline:
>
> Introducing another dataset is unnecessary; we can leverage existing readers even for the unit tests.

I removed the pickle dataset and used @r-barnes's commit to switch to the tutorial dataset.

> Rather than uploading the datasets directly to the repo, we can upload them to S3 and download them as part of the tests, the same way we currently handle checkpoints.

> It would be great to include eqv2 instead of or alongside escn, as it's the more commonly used model.

Added tests for eqv2 and gemnet.

> We should make sure to test the other modes as well.

Added a test for predict mode (alongside the existing train mode test).

lbluque · 2024-04-24T21:58:23Z

tests/e2e/test_s2ef.py

+        [
+            pytest.param("gemnet", 0.4, 0.06, id="gemnet"),
+            pytest.param("escn", 0.4, 0.06, id="escn"),
+            pytest.param("equiformer_v2", 0.4, 0.06, id="equiformer_v2"),


Can you use a syrupy snapshot here instead of hardcoding the expected energy and force mae?

misko · 2024-05-10

Training is not reproducible to a reasonable accuracy :'( purely because of numerical instability. The values above are upper bounds: we want the metrics to be less than what is specified, not more. I don't think syrupy has a good way of doing this, but maybe I missed it.
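The upper-bound idea can be sketched as a small helper. This is an illustration only, not the test's actual code: the helper name and metric keys are hypothetical, the thresholds are the 0.4 and 0.06 from the parametrization above (assumed to be energy MAE and force MAE, in that order), and the metric values are made up.

```python
def metrics_over_limit(metrics, upper_bounds):
    """Return the names of metrics that exceed their allowed maximum."""
    return [
        name
        for name, limit in upper_bounds.items()
        if metrics[name] > limit
    ]


# Thresholds from the parametrization above; the run results are invented.
bounds = {"energy_mae": 0.4, "forces_mae": 0.06}
run_a = {"energy_mae": 0.31, "forces_mae": 0.052}
run_b = {"energy_mae": 0.29, "forces_mae": 0.071}

violations_a = metrics_over_limit(run_a, bounds)
violations_b = metrics_over_limit(run_b, bounds)
print(violations_a, violations_b)
```

A snapshot comparison checks for equality with a stored value, whereas this check passes for any result at or below the bound, which is what tolerates run-to-run numerical noise.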

lbluque · 2024-04-24T22:10:47Z

tests/e2e/test_s2ef.py

+    return {"src": src}
+
+
+def oc20_lmdb_train_and_val_from_paths(train_src, val_src, test_src=None):


Assuming this is a function so that it is easy to add additional train, val, and test src to the tests.

Would it be better to make this a fixture? If needed, it can be parametrized with different dataset src when the time comes.

misko · 2024-05-10

That was the intention. It's only used in two ways in the tests, and never both ways back to back.
One way is for training (train=tutorial_dataset, val=tutorial_dataset),
and the other is for testing (train=tutorial_dataset, val=tutorial_dataset, test=tutorial_dataset).
I don't know how to parametrize it in a way that makes it more readable. The examples I see online seem to cover the case where we want to run every parametrization of the fixture back to back.
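The two call patterns described above can be sketched with a plain helper. The body of `oc20_lmdb_train_and_val_from_paths` is not shown in this thread, so the function below is a hypothetical stand-in with an assumed return layout, and the dataset path is a placeholder.

```python
def build_dataset_config(train_src, val_src, test_src=None):
    """Hypothetical helper: build a dataset config from source paths.

    The returned layout is an assumption; the real helper's output is
    not shown in this thread.
    """
    config = {
        "train": {"src": train_src},
        "val": {"src": val_src},
    }
    if test_src is not None:
        # Only predict-style runs need a test split.
        config["test"] = {"src": test_src}
    return config


tutorial_dataset = "path/to/tutorial_dataset"  # placeholder path

# Pattern 1: training uses train and val only.
train_config = build_dataset_config(tutorial_dataset, tutorial_dataset)

# Pattern 2: testing adds a test split.
predict_config = build_dataset_config(
    tutorial_dataset, tutorial_dataset, test_src=tutorial_dataset
)
print(sorted(train_config), sorted(predict_config))
```

If a fixture is preferred, pytest's "factories as fixtures" pattern is one option: the fixture returns a callable like this one, so each test picks the call pattern it needs rather than every parametrization running back to back.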

codecov-commenter · 2024-05-09T22:00:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

see 20 files with indirect coverage changes

lbluque

lgtm - thanks @misko!

…d optimization test for [escn,gemnet,equiformer_v2] (#640)

* Add a simple pickle dataset type and a test case for escn training
* fix import
* lint
* wrong paths
* circleci gets a bit diff result than local, add buffer
* Add S2EF e2e test
* working e2e smoke test and short optimizer tests
* remove unused pickle dataset support and data files
* add torch deterministic
* lint
* lint again
* clean up tests using parameterize, add tests for predict
* lint
* remove unused imports from test_escn
* fixes
* lint
* fix lint
* fix yaml paths
* correct scaling path
* promote up tests folder
* fix up tests

---------

Co-authored-by: Richard Barnes <rbarnes@umn.edu>

Add a simple pickle dataset type and a test case for escn training

63fe5e2

misko requested review from wood-b and mshuaibii March 30, 2024 03:27

misko added 4 commits March 30, 2024 15:49

fix import

f0d708d

lint

53283f6

wrong paths

ef0dee4

circleci gets a bit diff result than local, add buffer

dc94191

mshuaibii mentioned this pull request Apr 8, 2024

Add S2EF e2e test #547

Closed

mshuaibii marked this pull request as draft April 9, 2024 19:26

mshuaibii changed the title ~~Add a simple pickle dataset type and a test case for escn training~~ [BE] Add a simple pickle dataset type and a test case for escn training Apr 9, 2024

mshuaibii assigned misko Apr 9, 2024

mshuaibii added the enhancement New feature or request label Apr 9, 2024

lbluque added the test unit tests label Apr 9, 2024

r-barnes and others added 9 commits April 12, 2024 21:36

Add S2EF e2e test

71c4247

working e2e smoke test and short optimizer tests

de86437

remove unused pickle dataset support and data files

2a03732

add torch deterministic

8137bdc

lint

3a9f90c

lint again

36153b2

clean up tests using parameterize, add tests for predict

cf2578f

lint

fc944e3

Merge branch 'main' into add_test_for_escn_train

82d505b

misko changed the title ~~[BE] Add a simple pickle dataset type and a test case for escn training~~ [BE] Add smoke test for [escn,gemnet,equiformer_v2] train+predict, Add optimization test for [escn,gemnet,equiformer_v2] Apr 16, 2024

remove unused imports from test_escn

910aa86

misko marked this pull request as ready for review April 16, 2024 17:09

lbluque requested changes Apr 24, 2024

merged monorepo

8c09c7a

misko added 10 commits May 9, 2024 23:02

fixes

49df82c

lint

db81a0a

fix lint

d27725d

Merge branch 'main' into add_test_for_escn_train

76ab175

fix yaml paths

04ddddf

Merge branch 'add_test_for_escn_train' of github.com:Open-Catalyst-Pr…

5f81adb

…oject/ocp into add_test_for_escn_train

correct scaling path

c10bba1

Merge branch 'main' into add_test_for_escn_train

e92875e

promote up tests folder

de44ec6

fix up tests

b833599

misko force-pushed the add_test_for_escn_train branch from 49743a8 to b833599 Compare May 14, 2024 00:08

lbluque approved these changes May 14, 2024

misko added this pull request to the merge queue May 14, 2024

Merged via the queue into main with commit 18aec28 May 14, 2024
5 checks passed

misko deleted the add_test_for_escn_train branch May 14, 2024 16:58


