[BE] Add smoke test for [escn,gemnet,equiformer_v2] train+predict, Add optimization test for [escn,gemnet,equiformer_v2] #640

Merged: 26 commits into main from add_test_for_escn_train, May 14, 2024

misko
Collaborator

@misko misko commented Mar 30, 2024

Add a test case for training a simple eSCN model on a small subset of OC22 (first 16 elements of train).

To support this I added the pickle dataset format. This is a simple list of PyTorch Geometric objects returned from the LMDB dataloaders.

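import pickle

# LmdbDataset is the repository's existing LMDB reader.
def lmdb_dataset_to_pkl(lmdb_src, pickle_output_filename, n=128):
    """Dump the first n samples of an LMDB dataset to a pickle file."""
    ds = LmdbDataset({"src": lmdb_src})
    # Use a context manager so the output file is always closed.
    with open(pickle_output_filename, "wb") as f:
        pickle.dump([ds[idx] for idx in range(n)], f)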

^ generate a pickle dataset from an existing lmdb dataset
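
For reference, a minimal sketch of how a pickle file like the one produced above could be read back as a list-style dataset. The class name `PickleListDataset` is hypothetical and is not the implementation from this PR; it only assumes the file holds a single pickled list of samples.

```python
import pickle


class PickleListDataset:
    """Hypothetical reader for a pickled list of samples (not the PR's class)."""

    def __init__(self, path):
        # The file is assumed to contain one pickled Python list.
        # Only load pickle files from sources you trust.
        with open(path, "rb") as f:
            self._items = pickle.load(f)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, idx):
        return self._items[idx]
```

Because the whole list is loaded into memory at construction time, this kind of reader is only reasonable for small test fixtures such as the 16 to 128 samples discussed here.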

We can use tests like this in the main repo and also in developing experimental branches to automatically test and confirm that we did not obviously break training.

@misko misko requested review from wood-b and mshuaibii March 30, 2024 03:27

codecov bot commented Mar 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.29%. Comparing base (daf72a5) to head (910aa86).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #640      +/-   ##
==========================================
+ Coverage   57.27%   62.29%   +5.01%     
==========================================
  Files         109      109              
  Lines       10316    10316              
==========================================
+ Hits         5909     6426     +517     
+ Misses       4407     3890     -517     

@mshuaibii mshuaibii mentioned this pull request Apr 8, 2024
@mshuaibii mshuaibii marked this pull request as draft April 9, 2024 19:26
@mshuaibii
Collaborator

Documenting our chat offline:

  • Introducing another dataset is unnecessary; we can leverage existing readers even for the unit tests.
  • Rather than uploading the datasets directly to the repo, we can upload them to s3 and download them as part of the tests, the same way we currently handle checkpoints.
  • It would be great to include eqv2 instead of or alongside escn, as it's the more commonly used model.
  • We should make sure to test the other modes as well.

@mshuaibii mshuaibii changed the title Add a simple pickle dataset type and a test case for escn training [BE] Add a simple pickle dataset type and a test case for escn training Apr 9, 2024
@mshuaibii mshuaibii added the enhancement New feature or request label Apr 9, 2024
@lbluque lbluque added the test unit tests label Apr 9, 2024
@misko misko changed the title [BE] Add a simple pickle dataset type and a test case for escn training [BE] Add smoke test for [escn,gemnet,equiformer_v2] train+predict, Add optimization test for [escn,gemnet,equiformer_v2] Apr 16, 2024
@misko
Collaborator Author

misko commented Apr 16, 2024

Documenting our chat offline:

  • Introducing another dataset is unnecessary; we can leverage existing readers even for the unit tests.

I removed the pickle dataset and used @r-barnes's commit to switch to the tutorial dataset.

  • Rather than uploading the datasets directly to the repo, we can upload them to s3 and download them as part of the tests, the same way we currently handle checkpoints.
  • It would be great to include eqv2 instead of or alongside escn, as it's the more commonly used model.

Added tests for eqv2 and gemnet

  • We should make sure to test the other modes as well

Added test for predict mode (alongside the existing train mode test)

@misko misko marked this pull request as ready for review April 16, 2024 17:09
tests/e2e/test_s2ef.py (outdated, resolved)
[
pytest.param("gemnet", 0.4, 0.06, id="gemnet"),
pytest.param("escn", 0.4, 0.06, id="escn"),
pytest.param("equiformer_v2", 0.4, 0.06, id="equiformer_v2"),
Collaborator

Can you use a syrupy snapshot here instead of hardcoding the expected energy and force mae?

Collaborator Author

Training is not reproducible to a reasonable accuracy :'( just due to numerical instability. In the above we want the metrics to be less than what is specified, not more. I don't think syrupy has a good way of doing this, but maybe I missed it.
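
A minimal sketch of the upper-bound style of check described in this reply, as opposed to an exact snapshot comparison. The helper name `metrics_over_limit` and the metric keys `energy_mae` and `forces_mae` are hypothetical, and reading 0.4 and 0.06 as the energy and force MAE limits is an assumption based on the review comment above.

```python
def metrics_over_limit(metrics, limits):
    """Return {name: (value, limit)} for every metric that is not <= its limit.

    Hypothetical helper for illustration; not taken from the PR.
    Using `not value <= limit` also flags NaN, since NaN comparisons are False.
    """
    return {
        name: (metrics[name], limit)
        for name, limit in limits.items()
        if not metrics[name] <= limit
    }


# Assumed mapping of the parametrized values to metric names.
limits = {"energy_mae": 0.4, "forces_mae": 0.06}
observed = {"energy_mae": 0.31, "forces_mae": 0.05}  # made-up numbers
assert metrics_over_limit(observed, limits) == {}
```

A bound like this tolerates small run-to-run differences in training while still failing when a model regresses badly, which an exact snapshot cannot do.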

tests/e2e/test_s2ef.py (outdated, resolved)
return {"src": src}


def oc20_lmdb_train_and_val_from_paths(train_src, val_src, test_src=None):
Collaborator

Assuming this is a function so that it is easy to add additional train, val, test src to the tests.

Would it be better to make this a fixture (if needed, it can be parametrized with different dataset src when the time comes)?

Collaborator Author

That was the intention; it's only used in two ways in the tests, and never both ways back to back.
One way is for training (train=tutorial_dataset, val=tutorial_dataset),
and the other is for testing (train=tutorial_dataset, val=tutorial_dataset, test=tutorial_dataset).
I don't know how to parametrize it in a way that makes it more readable. The examples I see online seem to cover the case where we want to run every parametrization of the fixture back to back.
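
One possible answer, sketched under assumptions: pytest's indirect parametrization lets each test pick the single configuration it needs, so the fixture does not have to run every variant back to back. The names `build_dataset_config` and `dataset_config`, the returned dictionary layout, and the path `tutorial_dset` are all hypothetical stand-ins, not the PR's code.

```python
import pytest


def build_dataset_config(train_src, val_src, test_src=None):
    """Hypothetical stand-in for the helper under review; layout is assumed."""
    config = {"train": {"src": train_src}, "val": {"src": val_src}}
    if test_src is not None:
        config["test"] = {"src": test_src}
    return config


@pytest.fixture
def dataset_config(request):
    # request.param is supplied per test by parametrize(..., indirect=True).
    return build_dataset_config(**request.param)


@pytest.mark.parametrize(
    "dataset_config",
    [{"train_src": "tutorial_dset", "val_src": "tutorial_dset"}],
    indirect=True,
)
def test_train_config(dataset_config):
    # Only the train/val variant is built for this test.
    assert "test" not in dataset_config
```

A predict-mode test would pass a dictionary that also includes `test_src`. Whether this reads better than calling the plain function directly is a judgment call.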

@codecov-commenter

codecov-commenter commented May 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

see 20 files with indirect coverage changes

Collaborator

@lbluque lbluque left a comment

lgtm - thanks @misko!

@misko misko added this pull request to the merge queue May 14, 2024
Merged via the queue into main with commit 18aec28 May 14, 2024
5 checks passed
@misko misko deleted the add_test_for_escn_train branch May 14, 2024 16:58
levineds pushed a commit that referenced this pull request Jul 11, 2024
…d optimization test for [escn,gemnet,equiformer_v2] (#640)

* Add a simple pickle dataset type and a test case for escn training

* fix import

* lint

* wrong paths

* circleci gets a bit diff result than local, add buffer

* Add S2EF e2e test

* working e2e smoke test and short optimizer tests

* remove unused pickle dataset support and data files

* add torch deterministic

* lint

* lint again

* clean up tests using parameterize, add tests for predict

* lint

* remove unused imports from test_escn

* fixes

* lint

* fix lint

* fix yaml paths

* correct scaling path

* promote up tests folder

* fix up tests

---------

Co-authored-by: Richard Barnes <rbarnes@umn.edu>
Labels
enhancement New feature or request test unit tests
5 participants