@deltamarnix mentioned that it would be good to first write up what we want to test in addition to what is currently in our CI. We can then work on automating these tests where possible.
Broad metrics I'm currently thinking we could test for:
- Did the test run without error, and did the validation succeed?
- Are the model input files written by iMOD Python the same (accounting for differences in the sorting of list-based input)?
- Are iMOD Python's run times in a similar ballpark?
- Are there large differences in memory usage? Did memory usage explode?
- How much does the model output differ? (Compute some basic statistics: mean, min, max difference.)
- Make some plots of model output differences; these can help identify where differences occur.
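As a starting point for the output-difference metrics, something like the sketch below could work. This is just an illustration with NumPy, not existing code; the `diff_stats` helper and the example arrays are made up, and in practice the inputs would be heads or budgets loaded from the two model runs:

```python
import numpy as np

def diff_stats(reference, candidate):
    """Basic statistics of the difference between two model output arrays.

    NaNs (e.g. inactive cells) are ignored via the nan-aware reductions.
    """
    diff = np.asarray(candidate, dtype=float) - np.asarray(reference, dtype=float)
    return {
        "mean": float(np.nanmean(diff)),
        "min": float(np.nanmin(diff)),
        "max": float(np.nanmax(diff)),
        "max_abs": float(np.nanmax(np.abs(diff))),
    }

# Small arrays standing in for model heads from two runs:
reference = np.array([1.0, 2.0, 3.0])
candidate = np.array([1.0, 2.1, 2.9])
stats = diff_stats(reference, candidate)
```

The same `diff` array could then be fed to a plotting routine for the last bullet, e.g. a map of per-cell differences per layer.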