Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkpointing, NetCDF, and golden master tests #140

Merged
merged 21 commits into from
Mar 21, 2019
Merged

Conversation

ali-ramadhan
Copy link
Member

@ali-ramadhan ali-ramadhan commented Mar 19, 2019

Call them system tests, model verification tests, or end-to-end tests, that's what this PR is about.

Currently a work in progress as I hit a roadblock with checkpointing so the tests will fail. See #141. BUT, for testing we can just set model.forcings = nothing and manually reconstruct the forcings as we know them.

Tests implemented so far:

  1. Checkpointing integration test: Run two coarse rising thermal bubble simulations and make sure that when restarting from a checkpoint, the restarted simulation matches the non-restarted simulation numerically.
  2. NetCDF output integration test: Run a coarse thermal bubble simulation and save the output to NetCDF at the 10th time step. Then read back the output and test that it matches the model's state.
  3. Thermal bubble golden master test: Run the coarse thermal bubble simulation for 10 time steps and check that the model output matches a golden master output.
  4. Deep convection golden master test

@jm-c @christophernhill: do post any ideas for tests that should be implemented in this PR.

@ali-ramadhan ali-ramadhan added the testing 🧪 Tests get priority in case of emergency evacuation label Mar 19, 2019
@ali-ramadhan ali-ramadhan added this to the v0.5 milestone Mar 19, 2019
@codecov
Copy link

codecov bot commented Mar 20, 2019

Codecov Report

Merging #140 into master will increase coverage by 3.21%.
The diff coverage is 77.77%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #140      +/-   ##
========================================
+ Coverage   56.78%    60%   +3.21%     
========================================
  Files          19     19              
  Lines         597    620      +23     
========================================
+ Hits          339    372      +33     
+ Misses        258    248      -10
Impacted Files Coverage Δ
src/models.jl 80% <ø> (ø) ⬆️
src/output_writers.jl 37.64% <77.77%> (+37.64%) ⬆️
src/time_steppers.jl 67.72% <0%> (+0.63%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 47b4613...b880a9b. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 20, 2019

Codecov Report

Merging #140 into master will increase coverage by 9.94%.
The diff coverage is 78.94%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #140      +/-   ##
==========================================
+ Coverage   56.03%   65.97%   +9.94%     
==========================================
  Files          19       19              
  Lines         605      629      +24     
==========================================
+ Hits          339      415      +76     
+ Misses        266      214      -52
Impacted Files Coverage Δ
src/models.jl 80% <ø> (ø) ⬆️
src/output_writers.jl 79.78% <78.94%> (+79.78%) ⬆️
src/time_steppers.jl 67.72% <0%> (+0.63%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7163560...371af73. Read the comment docs.

metadata, grid, stepper_tmp = model.metadata, model.grid, model.stepper_tmp
if metadata.arch == :cpu
if metadata.arch == :CPU
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we save FFT plans?

It makes sense that a user should plan new FFTs if they are going to run a model, even if they are restarting a checkpoint on the same computer.

@christophernhill what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. We set model.poisson_solver = nothing before checkpointing then reconstruct the FFT plans afterwards.

So you're saying the user should be manually planning the FFTs themselves?

@ali-ramadhan
Copy link
Member Author

ali-ramadhan commented Mar 20, 2019

Note to self: Seems like the golden master tests fail on Mac (Travis CI) so it seems that JLD might not be portable (although I generated the JLD checkpoint files on Windows and the Linux tests pass).

So for golden master testing I should rely on NetCDF output which is guaranteed to be portable.

EDIT: Hmmm, it might actually just be finicky (sometimes passes, sometimes doesn't). Another reason to use NetCDF. But so far if I keep running the tests on my laptop it always passes.

@ali-ramadhan ali-ramadhan marked this pull request as ready for review March 21, 2019 14:49
I think there's an issue with the pseudorandom number generator being 
portable. We still have the thermal bubble golden master test.
@ali-ramadhan ali-ramadhan self-assigned this Mar 21, 2019
@ali-ramadhan
Copy link
Member Author

@jm-c and I discussed this PR yesterday and seemed good to merge. Looks like the tests work now so I'm merging.

@ali-ramadhan ali-ramadhan changed the title [WIP] System tests Checkpointing, NetCDF, and golden master tests Mar 21, 2019
@ali-ramadhan ali-ramadhan merged commit ecc0e46 into master Mar 21, 2019
@ali-ramadhan ali-ramadhan deleted the system-tests branch March 28, 2019 13:32
arcavaliere pushed a commit to arcavaliere/Oceananigans.jl that referenced this pull request Nov 6, 2019
Checkpointing, NetCDF, and golden master tests

Former-commit-id: ecc0e46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing 🧪 Tests get priority in case of emergency evacuation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants