Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dimensionless Units #55

Merged
merged 42 commits into from
Nov 25, 2024
Merged

Dimensionless Units #55

merged 42 commits into from
Nov 25, 2024

Conversation

pgierz
Copy link
Member

@pgierz pgierz commented Nov 8, 2024

Work for dimensionless units. Will close #54

@pgierz pgierz linked an issue Nov 8, 2024 that may be closed by this pull request
@pgierz pgierz added this to the Beta Release milestone Nov 8, 2024
@pgierz pgierz added documentation Improvements or additions to documentation enhancement New feature or request interface backend labels Nov 8, 2024
@pgierz
Copy link
Member Author

pgierz commented Nov 8, 2024

For developers: the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about. You can access them on the rule object as attribute dimensionless_unit_mappings.

@siligam
Copy link
Contributor

siligam commented Nov 8, 2024

@pgierz , nice to see the dimensionless_unit_mappings set on rule. I guess unit conversion function should looked out for this attribute and act accordingly. If so, I will make necessary changes to that function. One follow up question: Should the units in the final output point to units in the CMIP table or the one used in this unit_mapping? For instance, "0.001" is the unit in the table. Its unit_mapping says "g/kg". The later is used by the unit conversion function. I guess the unit representation in the final output should still be "0.001" as suggested in the table. Right?

@pgierz
Copy link
Member Author

pgierz commented Nov 8, 2024

I guess the unit representation in the final output should still be "0.001" as suggested in the table. Right?

Correct. At the end of the CMOR process, we need to have metadata attributes as defined in the CMOR Tables.

@mandresm
Copy link
Contributor

mandresm commented Nov 8, 2024

Hi guys, just so I make sure I am following, this are the steps and tooling needed correct?

  • read the dimensionless_unit_mappings. @pgierz this is also what you are saying here as well right?

the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about

  • Prepare checks raise errors if:

    • a. The user has not specified a "pint-understandable" unit and the original unit in the file doesn't make any sense to pint
    • b. One of the rules is trying to convert to a unit in the CMOR table that has no mapping to a pint-meaningfull physical variable
  • Modify the unit conversion to get the attribute from CMOR table where we stored the physically meaningfull unit (g/kg) and use that to do the conversion.

Any step missing?

I'd suggest we start writing the dimensionless_unit_mappings yaml with just one variable to test, for example one of the salinities that @siligam have shown from his tables.

@mandresm mandresm mentioned this pull request Nov 8, 2024
@pgierz
Copy link
Member Author

pgierz commented Nov 8, 2024

  1. read the dimensionless_unit_mappings and assign them to their corresponding CMOR variables. @pgierz this is also what you are saying here as well right?

the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about

Partially. The mappings are read and all are assigned to each Rule: no filtering is done. If you want each Rule to only know about one specific mapping, that is easy enough to do, I can add that.

@pgierz
Copy link
Member Author

pgierz commented Nov 8, 2024

I'd suggest we start writing the dimensionless_unit_mappings yaml with just one variable to test, for example one of the salinities that @siligam have shown from his tables.

Probably it would be a good idea to turn that into a unit test so it is run automatically.

@siligam
Copy link
Contributor

siligam commented Nov 8, 2024

I will implement the tastcase for salinity using the dimensionless_unit_mappings feature.

Just for reference, model-table unit listing https://notes.desy.de/CXe0axlgSbStWgOLjaXnlQ?view

@mandresm
Copy link
Contributor

mandresm commented Nov 8, 2024

  1. read the dimensionless_unit_mappings and assign them to their corresponding CMOR variables. @pgierz this is also what you are saying here as well right?

the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about

Partially. The mappings are read and all are assigned to each Rule: no filtering is done. If you want each Rule to only know about one specific mapping, that is easy enough to do, I can add that.

I'm not understanding something: don't the rules belong to the model side of things?

Why would we need to append it to the rule and not to the CMOR object? The mapping is for the CMOR unit, not the model unit. Please guys, let me know if I need to reread the documentation or the code, I might be saying bullshit here

@pgierz
Copy link
Member Author

pgierz commented Nov 8, 2024

Please guys, let me know if I need to reread the documentation or the code, I might be saying bullshit here

This is badly documented, so read the code ;-) I am talking about the CMORizer object here (the main controller) which has information about everything It's the part that controls the parallelisation and knows which rules exist, what the user configuration looks like, how to match up model files to cmor table entries etc etc.

At the moment, it looks like this:

>>> user_config_file = pathlib.Path("/some/file.yaml")
>>> f = yaml.safe_load(user_config_file)
>>> cmorizer = CMORizer.from_dict(f)
>>> type(cmorizer.rules)
list
>>> first_rule = cmorizer.rules[0]
>>> first_rule.model_variable
"sst"
>>> first_rule.cmor_variable
"sos"
>>> first_rule.dimensionless_unit_mappings
{"sst": {"0.001": "g/kg"}, "other_key": {"1": "µmol mol^-1"}, ...}

Of course this example is nonsense, sst isn't a concentration, but just for illustrative purposes of what sorts of things are where in the objects.

@christian-stepanek
Copy link

I guess the unit representation in the final output should still be "0.001" as suggested in the table. Right?

Correct. At the end of the CMOR process, we need to have metadata attributes as defined in the CMOR Tables.

Yes, the units in the tables are the target, they are the convention.

pgierz and others added 3 commits November 8, 2024 13:59
Better example in the example. SST isn't meaningful for this use case, we instead will test with salinity of the ocean (variable `so`)

Co-authored-by: Miguel <63242832+mandresm@users.noreply.github.com>
mandresm and others added 3 commits November 19, 2024 13:44
…d a test for checking missing mapping in the dimensionless mapping (not yet working as the checker has not yet been implemented)
src/pymorize/cmorizer.py Outdated Show resolved Hide resolved
@mandresm mandresm marked this pull request as ready for review November 19, 2024 18:23
Copy link
Contributor

@mandresm mandresm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a rework, I am approving this. @pgierz, you can go ahead and review it yourself.

We might want to break down handle_unit_conversion into smaller functions, or even steps in the pipeline, but I'd wait to see how long it gets after #74 and decide then

Copy link
Contributor

@mandresm mandresm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit tests are failing because of this:

FAILED tests/unit/test_units.py::test_units_with_g_kg_to_0001_g_kg - FileNotFoundError: [Errno 2] No such file or directory: '/opt/hostedtoolcache/Python/3.9.20/x64/lib/python3.9/data/dimensionless_mappings.yaml'

https://github.com/esm-tools/pymorize/actions/runs/11919421295/job/33219085869?pr=55#step:8:589

Somehow data is not included as resources in the pip package. I will take care of that tomorrow.

@mandresm
Copy link
Contributor

@pgierz, does this commit make sense to you or is there a better way of doing this keeping the data directory in the root directory? bc5586a

@mandresm
Copy link
Contributor

Blocked by #66

setup.py Outdated Show resolved Hide resolved
src/pymorize/cmorizer.py Outdated Show resolved Hide resolved
@mandresm
Copy link
Contributor

Tests are failing, I'll look into that on Monday

@mandresm mandresm self-requested a review November 25, 2024 14:48
@mandresm mandresm merged commit 3042e47 into main Nov 25, 2024
4 checks passed
@mandresm mandresm deleted the feat/dimensionless-units branch November 25, 2024 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend documentation Improvements or additions to documentation enhancement New feature or request interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dimensionless and other strange unit conversion PSU Conversion
5 participants