Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/impact write from hdf5 #606

Merged
merged 13 commits into from
Feb 2, 2023
Merged

Conversation

ThomasRoosli
Copy link
Collaborator

@ThomasRoosli ThomasRoosli commented Dec 14, 2022

Changes proposed in this PR:

  • introduce write_hdf5 and read_hdf5 methods in Impact
  • so far Impact.event_name does not have to be of type list(str) but just of type list. This is a requirement in Hazard, where this attribute generally originates from. I propose to make this the required type also in Impact. This would mean we have to change the docstring and some tests.

This PR fixes #

PR Author Checklist

PR Reviewer Checklist

Copy link
Member

@peanutfun peanutfun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gets the job done, but I have to say that I find the overall structure of the functions a bit hard to understand. It currently iterates over all __dict__ entries, and then uses a large if/elif/else tree to do specific things with specific items. I would much prefer to instead define local "writer functions" and a mapping from attribute name or type to writer function. Then we can iterate over all attributes, check for a writer function or fall back to a default one. This would also help with some code duplications, specifically for writing all the tags. I'm thinking of something like this:

def write_str(name, value):
    data.attrs[name] = value
def write_csr(name, value):
    data.create_dataset(name, data=value.toarray())
def write_default(name, value):
    data.create_dataset(name, data=value)

attr_writers = {str: write_str, sparse.csr_matrix: write_csr}
for (var_name, var_val) in self.__dict__.items():
    # Fetch the write function if it exists for the type. If not, use the default one.
    write_func = attr_writers.get(type(var_val), write_default)
    # Call it with the attr name and value
    write_func(var_name, var_val)

(The same goes for the reader)

climada/engine/impact.py Outdated Show resolved Hide resolved
climada/engine/impact.py Outdated Show resolved Hide resolved
climada/engine/impact.py Outdated Show resolved Hide resolved
climada/engine/impact.py Outdated Show resolved Hide resolved
climada/engine/impact.py Outdated Show resolved Hide resolved
@peanutfun peanutfun self-assigned this Jan 24, 2023
@peanutfun peanutfun marked this pull request as draft January 24, 2023 16:21
@peanutfun peanutfun marked this pull request as ready for review January 27, 2023 13:20
@peanutfun
Copy link
Member

@ThomasRoosli I think I am finished. Would you have a look?

@ThomasRoosli ThomasRoosli changed the title WIP: Feature/impact write from hdf5 Feature/impact write from hdf5 Jan 30, 2023
@ThomasRoosli
Copy link
Collaborator Author

Thanks @peanutfun for restructuring especially the write_hdf5 function using type specific writers. And thanks for extending the tests. I am now very happy with the status of the code and the functionality it adds to the impact class. I am happy if we can merge this.

@peanutfun
Copy link
Member

@emanuel-schmid This new Impact.write_hdf5 uses generic writer functions. @ThomasRoosli and I talked about making them part of a utility module so that we can use these functions in any other <Class>.write_hdf5 method. We opted against it to simplify this PR. However, defining all the writer functions inside the write_hdf5 method makes the linter unhappy. What would you recommend: Split the generic writer functions into a utility module or keep them inside write_hdf5 at the expense of some added linter issues?

@emanuel-schmid
Copy link
Collaborator

emanuel-schmid commented Feb 1, 2023

Very nice! 😁

Split the generic writer functions into a utility module or keep them inside write_hdf5 at the expense of some added linter issues?

My preference would be to make it a TypeWriter class that takes the type_writers dict as __init__ argument and has one method, write. Could be part of the hdf5_handlers module or have a module of its own. I'm positive we'll be using it elsewhere, eventually.
However this can also be a PR of its own. Up to you. 😎

@peanutfun
Copy link
Member

@emanuel-schmid I don't want to over-complicate this PR. I like your proposal a lot, but I would want to discuss it first and then have a follow-up PR.

@peanutfun peanutfun merged commit f990035 into develop Feb 2, 2023
@emanuel-schmid emanuel-schmid deleted the feature/impact_write_from_hdf5 branch February 3, 2023 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants