Add classmethod to Hazard for reading raster-like data from NetCDF file #487
Comments
Great idea. Three points:
|
Some points from my side:
|
How should we handle optional information (i.e. information for which we can supply a default) in general? If we don't want users to always specify which data exactly to read from a file, we would need to use some kind of lookup. Consider the following: The data file contains a coordinate "frequency". By default, the user probably expects this to be loaded as hazard event frequency, even without stating a So I see these use cases:
Examples in code:
# Signature:
@classmethod
def from_raster_netcdf(cls, file, frequency=None, **kwargs):
    pass

Hazard.from_raster_netcdf(file)  # Load 'frequency' data if available, use default otherwise
Hazard.from_raster_netcdf(file, frequency="freq")  # Load 'freq' data or throw error
Hazard.from_raster_netcdf(file, frequency="")  # Ignore 'frequency' data, use default
|
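The three use cases above could be handled by one small lookup helper. The following is only a sketch under stated assumptions: `read_optional` and its parameters are hypothetical names, not part of CLIMADA, and the dataset is modelled as any mapping from variable names to data (an `xarray.Dataset` supports the same `in` and `[]` operations).

```python
from typing import Any, Mapping, Optional


def read_optional(
    dataset: Mapping[str, Any],
    default_name: str,
    user_name: Optional[str],
    default_value: Any,
) -> Any:
    """Hypothetical helper: resolve optional data following the three use cases."""
    if user_name is None:
        # Nothing specified: load the default-named data if present, else the default.
        return dataset[default_name] if default_name in dataset else default_value
    if user_name == "":
        # Empty string: explicitly ignore the data in the file.
        return default_value
    if user_name not in dataset:
        # An explicit name was given, so missing data is an error.
        raise KeyError(f"Variable '{user_name}' not found in dataset")
    return dataset[user_name]
```

With this, `read_optional(ds, "frequency", None, default)` corresponds to the first call above, passing `"freq"` to the second, and passing `""` to the third.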
@chahank @emanuel-schmid @timschmi95 Some input/opinions would be welcome here 🙏
See #487 (comment) |
I like the general idea, but I am a bit confused by this use case. I would rather say that it then does not set any frequency value and the user should define it? |
Good point! I think the goal should be that the new method always returns a consistent Hazard object, meaning that it is ready to be used in computations. This is not the case if
# Load hazard, ignoring the 'frequency' data in the file
hazard = Hazard.from_raster_netcdf(file, frequency="")
# Default frequency is loaded, 'hazard' is consistent
np.testing.assert_array_equal(hazard.frequency, np.ones(hazard.event_id.size))
# Overwrite the frequency
# NOTE: Exactly the same if hazard.frequency is None or the default np.array
hazard.frequency = my_fancy_frequency |
Good point, it should return a consistent object. |
The Hazard class offers several options to instantiate it from data files, e.g. from_raster, from_excel, etc. The classmethod from_raster, in particular, uses rasterio to open datasets and read their metadata, coordinates, and data. In this issue, I want to discuss whether a general-purpose classmethod for reading data from a NetCDF file into a Hazard object would be useful, and what such a method could look like. A method implementing such functionality to some extent can be found at climada_petals/blob/feature/wildfire/climada_petals/hazard/wildfire.py#L2247.
What the method should do
Use a single NetCDF file to load data for a consistent instance of Hazard, meaning that if data is missing, it will be set to a sensible default.
The minimal (i.e., essential) data supplied as variables in the file should be
Optional data could include:
Method signature
from_netcdf should take the following arguments:
data (path-like or xarray.Dataset, required): The dataset. Open the file if it is a path.
intensity_var (string, required): The name of the hazard intensity variable in the dataset
fraction_var (string, optional): The name of the hazard fraction variable in the dataset
coordinate_vars (dict, optional): A mapping from default coordinate names to the variables used as coords in the dataset, e.g. dict(longitude="lon", latitude="y")
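To illustrate how the coordinate_vars argument might be interpreted, here is a minimal sketch. It assumes the default coordinate names are "time", "longitude", and "latitude" (the dimension names used in the example dataset of this issue); the helper name `resolve_coordinate_vars` is hypothetical and not part of CLIMADA.

```python
from typing import Dict, Optional

# Assumption: these are the default coordinate names the method would look for.
DEFAULT_COORDS = ("time", "longitude", "latitude")


def resolve_coordinate_vars(
    coordinate_vars: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: merge user-supplied coordinate names with defaults."""
    # Start from the identity mapping: each default name maps to itself.
    coords = {name: name for name in DEFAULT_COORDS}
    if coordinate_vars:
        unknown = set(coordinate_vars) - set(DEFAULT_COORDS)
        if unknown:
            # Reject keys that are not known default coordinate names.
            raise ValueError(f"Unknown coordinates: {sorted(unknown)}")
        coords.update(coordinate_vars)
    return coords
```

Calling `resolve_coordinate_vars(dict(longitude="lon", latitude="y"))` keeps "time" at its default and overrides the other two, so users only name the coordinates that differ in their file.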
Method outline
Suppose a netCDF file contains the following data:
intensity: 3D dataset (dims: "time", "longitude", "latitude")
Then the following code creates a consistent Hazard instance from this data:
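As a rough illustration of the reshaping such a method would need to perform, the sketch below flattens a 3D intensity array into an events-by-centroids layout using plain Python lists. It is not CLIMADA code: `flatten_raster` is a hypothetical name, the returned dictionary only stands in for a Hazard object, event IDs counting from 1 are an assumption, and the default frequency of ones follows the default mentioned in the discussion above.

```python
from itertools import product
from typing import Any, Dict, List, Sequence


def flatten_raster(
    intensity: Sequence[Sequence[Sequence[float]]],
    time: Sequence[Any],
    longitude: Sequence[float],
    latitude: Sequence[float],
) -> Dict[str, List[Any]]:
    """Hypothetical sketch: intensity[t][i][j] -> one row per event, one column per centroid."""
    # One centroid per (longitude, latitude) pair; latitude varies fastest,
    # matching the order in which the nested lists are flattened below.
    centroids = list(product(longitude, latitude))
    matrix = [[value for row in step for value in row] for step in intensity]
    if len(matrix) != len(time) or any(len(r) != len(centroids) for r in matrix):
        raise ValueError("Intensity shape does not match the coordinates")
    return {
        "event_id": list(range(1, len(time) + 1)),  # assumption: IDs count from 1
        "frequency": [1.0] * len(time),  # default frequency: ones
        "centroids": centroids,
        "intensity": matrix,
    }
```

Each time step becomes one event and each grid cell becomes one centroid, so missing optional data such as frequency can be filled with a default without affecting the layout.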