Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for easier definition of new forcings #393

Merged
merged 29 commits into from
Mar 25, 2024
Merged

Conversation

BSchilperoort
Copy link
Member

@BSchilperoort BSchilperoort commented Feb 22, 2024

Changes:

Added

  • Added new forcing classes (#393):
    • LumpedUserForcing, DistributedUserForcing - have the variable names as an argument, as well as an optional post-processor that can derive addition variables from the downloaded data.
    • LumpedMakkinkForcing, DistributedMakkinkForcing - based on the UserForcing: selects the required variables and computes the Makkink potential evaporation.
    • LumpedPenmanMonteithForcing, DistributedPenmanMonteithForcing - based on the UserForcing: includes all meteo variables required to compute the Penman-Monteith potential evaporation. More (parameterset) info such as plant type and height are required to compute the actual E_pot.

Changed

  • Internal changes to DefaultForcing: variables are no longer attributes of the class, but are stored under a "filenames" (dict) attribute (#393).
  • Updated the documentation to reflect the changes in the forcing generation (#393).

I also tried to silence most useless esmvalcore warnings.

I split up the user_guide notebook into 4 parts, because it was too big to easily modify and run. They are now organized under a separate section on the docs, preserving their original structure.

A demo notebook is available here:
_demo_forcing_builder.ipynb.zip

Also closes #232, #387

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@BSchilperoort BSchilperoort marked this pull request as ready for review March 12, 2024 16:03
@BSchilperoort BSchilperoort linked an issue Mar 13, 2024 that may be closed by this pull request
Copy link
Member

@sverhoeven sverhoeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new forcing classes look great.

I was able to generate forcing using GenericDistributedForcing, LumpedMakkinkForcing, LumpedUserForcing.

Initially LumpedMakkinkForcing.generate example gaveRecipeError: Could not create all tasks with empty esmvaltool log files. When I disable the log swallowing in run_recipe(), the log files are filled again with esgf creds error. The log swallowing is a bit to much, now can dial it back.

In docs/adding_models.rst we should no longer just mention the Default and Generic forcings. It would be nice to link back to the forcing user guide.

Can you add tests for some of the new algorithms like sources, makkink and merge_esmvaltool_datasets?

The CI is red. Can you make it green?


_model_entrypoints = entry_points(group="ewatercycle.models") # /NOSONAR

# Expose as "from ewatercycle.models import Model" for backward compatibility
for _model in _model_entrypoints:
globals()[_model.name] = _model.load()


class ModelSources(Mapping):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class look very much like ForcingSource except the return type and its super is from typing vs collections.abc . Can we have the same super? Can we make a Source super class with a generic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are very similar yes. In some way it would be nicer to be able to list models, forcings, and parameter sets all in the same way.

Before this change it was:

  • ewatercycle.forcing.sources
  • dir(ewatercycle.models) which also was cluttered with non-useful info
  • ewatercycle.parameter_sets.available_parameter_sets()

It would also be nice to still unify how to access these... three different ways for what is essentially the same is not very user friendly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have the same super? Can we make a Source super class with a generic?

Done.

It would also be nice to still unify models, forcings, and parameter

I've now unified models and forcings, however parameter_sets it still tbd. Probably better in a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice if you did not need a subclass with getitem method.
I tried Mapping[str, DefaultForcing] and UserDict instead of Mapping, but got stuck on value being a class or an entrypoint.

On my machine it does not make much difference if load() is called or not:

In [1]: from importlib.metadata import entry_points
In [4]: %timeit [f.load() for f in entry_points(group="ewatercycle.forcings")]
7.33 ms ± 158 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: %timeit entry_points(group="ewatercycle.forcings")
7.3 ms ± 81.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So we could simplify by calling the load() on import.

Copy link
Member Author

@BSchilperoort BSchilperoort Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subclassing here is to get the typing correct, not to delay the loading.

for the model sources:

    def __getitem__(self, key) -> Type[eWaterCycleModel]:

while the forcing sources are:

    def __getitem__(self, key) -> Type[DefaultForcing]:

It also allowed me to write a different docstring with appropriate examples for the model vs. forcing sources.

@BSchilperoort
Copy link
Member Author

Initially LumpedMakkinkForcing.generate example gaveRecipeError: Could not create all tasks with empty esmvaltool log files. When I disable the log swallowing in run_recipe(), the log files are filled again with esgf creds error. The log swallowing is a bit to much, now can dial it back.

I'll change the warning level to "error" so actual errors are still shown, but useless warning are still hidden.
I find it a bit frustrating that esmvaltool just dumps everything to the terminal and if you quiet the logging it also doesn't write it to file :/

Copy link
Member

@sverhoeven sverhoeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Only found 2 links in docs that are 404.

@Peter9192
Copy link
Collaborator

Great progress!

@BSchilperoort
Copy link
Member Author

BSchilperoort commented Mar 25, 2024

I tested all models:

  • leakybucket: needs update of plugin code (this PR breaks it)
  • wflow: needs update in tests only (unexpected filenames={} in forcing object). model runs with old forcing.
  • lisflood: needs update in tests only (unexpected filenames={} in forcing object). model runs with old forcing.
  • hype: needs update in tests only (unexpected filenames={} in forcing object). Can't run model due to unrelated (known) issue GRPC server is down when running the model.ipynb ewatercycle-hype#3
  • pcrglobwb: needs update in tests only (unexpected filenames={} in forcing object). model runs with old forcing.
  • marrmot: needs update in tests (unexpected filenames={} in forcing object). models run fine.

Merge + release should be OK. I'll open issues on the respective repos to track the tests breaking.

@BSchilperoort BSchilperoort merged commit 58a0150 into main Mar 25, 2024
2 of 3 checks passed
@BSchilperoort BSchilperoort deleted the additional-forcings branch March 25, 2024 10:29
Copy link

@sverhoeven
Copy link
Member

I tested all models:

  • leakybucket: needs update of plugin code (this PR breaks it)
  • wflow: needs update in tests only (unexpected filenames={} in forcing object). model runs with old forcing.
  • lisflood: needs update in tests only (unexpected filenames={} in forcing object). model runs with old forcing.
  • hype: needs update in tests only (unexpected filenames={} in forcing object). Can't run model due to unrelated (known) issue GRPC server is down when running the model.ipynb ewatercycle-hype#3
  • pcrglobwb: needs update in tests only (unexpected filenames={} in forcing object). model runs with old forcing.
  • marrmot: needs update in tests (unexpected filenames={} in forcing object). models run fine.

Merge + release should be OK. I'll open issues on the respective repos to track the tests breaking.

Ah we are checking the dump of a forcing object in the tests, makes sense that those tests fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Bart to CITATION.cff Add model.setup(cfg_dir=...) example to user guide
3 participants