Allow reading Hazard events that are not dates from xarray (#837)
* Allow reading Hazard events that are not dates from xarray

* Set default value of `Hazard.event_name` to empty string.
* Try interpreting values of the event coordinate as dates or ordinals
  for default values of `Hazard.date`. If that fails, issue a warning
  and set default values to zeros.
* Update tests.

* Try to read event coordinate as date

* Update climada/hazard/base.py

* Apply formatter to test_base_xarray.py

* Set default ordinal to 1, fix tests

* Fix linter warnings

* Switch back to class setup for xarray tests

* Clarify docstring of Hazard.from_xarray_raster

* Update CHANGELOG.md

* Update climada/hazard/base.py

---------

Co-authored-by: Chahan M. Kropf <chahan.kropf@usys.ethz.ch>
peanutfun and chahank authored Jan 16, 2024
1 parent 104dfdb commit 7e9b4f8
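
For orientation, the "ordinals" in the commit message are assumed here to be proleptic Gregorian day counts, the convention of Python's standard `datetime` module, in which 0001-01-01 is day 1. The following is a minimal standard-library sketch of that convention, not CLIMADA code; `to_ordinals` and `to_iso` are hypothetical helper names.

```python
import datetime as dt


def to_ordinals(iso_dates):
    """Convert 'YYYY-MM-DD' strings to proleptic Gregorian ordinals.

    Hypothetical helper for illustration only.
    """
    return [dt.date.fromisoformat(d).toordinal() for d in iso_dates]


def to_iso(ordinals):
    """Convert positive integer ordinals back to 'YYYY-MM-DD' strings."""
    return [dt.date.fromordinal(o).isoformat() for o in ordinals]


# Consecutive calendar days map to consecutive integers
ordinals = to_ordinals(["2024-01-15", "2024-01-16"])

# Small positive integers are valid ordinals too (days in year 1)
early = to_iso([2, 1])
```

Under this convention, a positive-integer event coordinate such as `[2, 1]` is already a valid set of ordinals, which is presumably why the new test treats it as dates without a warning.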
Showing 3 changed files with 90 additions and 17 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -20,6 +20,7 @@ Code freeze date: YYYY-MM-DD
- Update `CONTRIBUTING.md` to better explain types of contributions to this repository [#797](https://github.com/CLIMADA-project/climada_python/pull/797)
- The default tile layer in Exposures maps is not Stamen Terrain anymore, but [CartoDB Positron](https://github.com/CartoDB/basemap-styles). Affected methods are `climada.engine.Impact.plot_basemap_eai_exposure`,`climada.engine.Impact.plot_basemap_impact_exposure` and `climada.entity.Exposures.plot_basemap`. [#798](https://github.com/CLIMADA-project/climada_python/pull/798)
- Recommend using Mamba instead of Conda for installing CLIMADA [#809](https://github.com/CLIMADA-project/climada_python/pull/809)
- `Hazard.from_xarray_raster` now allows arbitrary values as 'event' coordinates [#837](https://github.com/CLIMADA-project/climada_python/pull/837)

### Fixed

78 changes: 61 additions & 17 deletions climada/hazard/base.py
@@ -463,12 +463,13 @@ def from_xarray_raster(
     ):
         """Read raster-like data from an xarray Dataset

-        This method reads data that can be interpreted using three coordinates for event,
-        latitude, and longitude. The data and the coordinates themselves may be organized
-        in arbitrary dimensions in the Dataset (e.g. three dimensions 'year', 'month',
-        'day' for the coordinate 'event'). The three coordinates to be read can be
-        specified via the ``coordinate_vars`` parameter. See Notes and Examples if you
-        want to load single-event data that does not contain an event dimension.
+        This method reads data that can be interpreted using three coordinates: event,
+        latitude, and longitude. The names of the coordinates to be read from the
+        dataset can be specified via the ``coordinate_vars`` parameter. The data and the
+        coordinates themselves may be organized in arbitrary dimensions (e.g. two
+        dimensions 'year' and 'altitude' for the coordinate 'event'). See Notes and
+        Examples if you want to load single-event data that does not contain an event
+        dimension.

         The only required data is the intensity. For all other data, this method can
         supply sensible default values. By default, this method will try to find these
@@ -513,12 +514,14 @@ def from_xarray_raster(

             Default values are:

-            * ``date``: The ``event`` coordinate interpreted as date
+            * ``date``: The ``event`` coordinate interpreted as date or ordinal, or
+              ones if that fails (which will issue a warning).
             * ``fraction``: ``None``, which results in a value of 1.0 everywhere, see
               :py:meth:`Hazard.__init__` for details.
             * ``hazard_type``: Empty string
             * ``frequency``: 1.0 for every event
-            * ``event_name``: String representation of the event time
+            * ``event_name``: String representation of the event date or empty strings
+              if that fails (which will issue a warning).
             * ``event_id``: Consecutive integers starting at 1 and increasing with time
         crs : str, optional
             Identifier for the coordinate reference system of the coordinates. Defaults
@@ -553,13 +556,16 @@ def from_xarray_raster(
           and Examples) before loading the Dataset as Hazard.
         * Single-valued data for variables ``frequency``, ``event_name``, and
           ``event_date`` will be broadcast to every event.
+        * The ``event`` coordinate may take arbitrary values. In case these values
+          cannot be interpreted as dates or date ordinals, the default values for
+          ``Hazard.date`` and ``Hazard.event_name`` are used, see the
+          ``data_vars`` parameter documentation above.
         * To avoid confusion in the call signature, several parameters are keyword-only
           arguments.
         * The attributes ``Hazard.haz_type`` and ``Hazard.unit`` currently cannot be
           read from the Dataset. Use the method parameters to set these attributes.
         * This method does not read coordinate system metadata. Use the ``crs`` parameter
           to set a custom coordinate system identifier.
-        * This method **does not** read lazily. Single data arrays must fit into memory.

         Examples
         --------
@@ -802,14 +808,48 @@ def strict_positive_int_accessor(array: xr.DataArray) -> np.ndarray:
                 raise ValueError(f"'{array.name}' data must be larger than zero")
             return array.values

-        def date_to_ordinal_accessor(array: xr.DataArray) -> np.ndarray:
+        def date_to_ordinal_accessor(
+            array: xr.DataArray, strict: bool = True
+        ) -> np.ndarray:
             """Take a DataArray and transform it into ordinals"""
-            if np.issubdtype(array.dtype, np.integer):
-                # Assume that data is ordinals
-                return strict_positive_int_accessor(array)
+            try:
+                if np.issubdtype(array.dtype, np.integer):
+                    # Assume that data is ordinals
+                    return strict_positive_int_accessor(array)
+
+                # Try transforming to ordinals
+                return np.array(u_dt.datetime64_to_ordinal(array.values))
+
+            # Handle access errors
+            except (ValueError, TypeError) as err:
+                if strict:
+                    raise err
+
+                LOGGER.warning(
+                    "Failed to read values of '%s' as dates or ordinals. Hazard.date "
+                    "will be ones only",
+                    array.name,
+                )
+                return np.ones(array.shape)
+
+        def year_month_day_accessor(
+            array: xr.DataArray, strict: bool = True
+        ) -> np.ndarray:
+            """Take an array and return an array of YYYY-MM-DD strings"""
+            try:
+                return array.dt.strftime("%Y-%m-%d").values
+
+            # Handle access errors
+            except (ValueError, TypeError) as err:
+                if strict:
+                    raise err

-            # Try transforming to ordinals
-            return np.array(u_dt.datetime64_to_ordinal(array.values))
+                LOGGER.warning(
+                    "Failed to read values of '%s' as dates. Hazard.event_name will be "
+                    "empty strings",
+                    array.name,
+                )
+                return np.full(array.shape, "")

         def maybe_repeat(values: np.ndarray, times: int) -> np.ndarray:
             """Return the array or repeat a single-valued array
@@ -840,8 +880,12 @@ def maybe_repeat(values: np.ndarray, times: int) -> np.ndarray:
                     None,
                     np.ones(num_events),
                     np.array(range(num_events), dtype=int) + 1,
-                    data[coords["event"]].dt.strftime("%Y-%m-%d").values.flatten().tolist(),
-                    np.array(u_dt.datetime64_to_ordinal(data[coords["event"]].values)),
+                    list(
+                        year_month_day_accessor(
+                            data[coords["event"]], strict=False
+                        ).flat
+                    ),
+                    date_to_ordinal_accessor(data[coords["event"]], strict=False),
                 ],
                 # The accessor for the data in the Dataset
                 accessor=[
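
The strict versus lenient accessor pattern in the hunks above can be illustrated without xarray or CLIMADA. The sketch below is a simplified stand-in written against the standard library only. The function `dates_to_ordinals`, the logger name `sketch.hazard`, and the use of ISO date strings as input are assumptions made for illustration. It mirrors the control flow: attempt the conversion, re-raise when `strict` is true, otherwise log a warning and fall back to ones.

```python
import datetime as dt
import logging

LOGGER = logging.getLogger("sketch.hazard")  # hypothetical logger name


def dates_to_ordinals(values, strict=True):
    """Return ordinals for positive integers or ISO date strings.

    Simplified stand-in for the accessor in the diff (assumption: plain
    Python lists instead of xarray DataArrays). With ``strict=False``,
    unreadable input yields a warning and a list of ones.
    """
    try:
        if all(isinstance(v, int) for v in values):
            # Assume that integer data already is ordinals
            if any(v < 1 for v in values):
                raise ValueError("ordinals must be larger than zero")
            return list(values)

        # Try transforming date strings to ordinals
        return [dt.date.fromisoformat(v).toordinal() for v in values]

    except (ValueError, TypeError) as err:
        if strict:
            raise

        LOGGER.warning("Failed to read values as dates or ordinals: %s", err)
        return [1] * len(values)
```

Keeping `strict=True` as the default means that explicitly requested data still fails loudly. Only the computation of default values opts into the lenient path, which matches how the diff passes `strict=False` in the `default_value` list.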
28 changes: 28 additions & 0 deletions climada/hazard/test/test_base_xarray.py
@@ -162,6 +162,33 @@ def test_type_and_unit(self):
self.assertEqual(hazard.haz_type, "TC")
self.assertEqual(hazard.units, "m/s")

+    def test_event_no_time(self):
+        """Test if an event coordinate that is not a time works"""
+        with xr.open_dataset(self.netcdf_path) as dataset:
+            size = dataset.sizes["time"]
+
+            # Positive integers (interpreted as ordinals)
+            time = [2, 1]
+            dataset["time"] = time
+            hazard = Hazard.from_xarray_raster(dataset, "", "")
+            self._assert_default_types(hazard)
+            np.testing.assert_array_equal(
+                hazard.intensity.toarray(), [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
+            )
+            np.testing.assert_array_equal(hazard.date, time)
+            np.testing.assert_array_equal(hazard.event_name, np.full(size, ""))
+
+            # Strings
+            dataset["time"] = ["a", "b"]
+            with self.assertLogs("climada.hazard.base", "WARNING") as cm:
+                hazard = Hazard.from_xarray_raster(dataset, "", "")
+            np.testing.assert_array_equal(hazard.date, np.ones(size))
+            np.testing.assert_array_equal(hazard.event_name, np.full(size, ""))
+            self.assertIn("Failed to read values of 'time' as dates.", cm.output[0])
+            self.assertIn(
+                "Failed to read values of 'time' as dates or ordinals.", cm.output[1]
+            )
+
     def test_data_vars(self):
         """Check handling of data variables"""
         with xr.open_dataset(self.netcdf_path) as dataset:
@@ -571,6 +598,7 @@ def test_errors(self):
coordinate_vars=dict(latitude="lalalatitude"),
)


# Execute Tests
if __name__ == "__main__":
TESTS = unittest.TestLoader().loadTestsFromTestCase(TestReadDefaultNetCDF)
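
The new test relies on `unittest.TestCase.assertLogs` returning captured messages in emission order: index 0 is the `event_name` warning and index 1 the `date` warning. The following self-contained sketch shows that mechanism in isolation. The logger name `sketch.hazard`, the function `load_defaults`, and the test class are hypothetical stand-ins, not part of CLIMADA.

```python
import io
import logging
import unittest

LOGGER = logging.getLogger("sketch.hazard")  # hypothetical logger name


def load_defaults():
    """Stand-in for a loader that warns once per failed default value."""
    LOGGER.warning("Failed to read values of '%s' as dates.", "time")
    LOGGER.warning("Failed to read values of '%s' as dates or ordinals.", "time")


class TestWarningOrder(unittest.TestCase):
    def test_two_warnings_in_order(self):
        # assertLogs fails if no message of at least WARNING level is logged
        with self.assertLogs("sketch.hazard", "WARNING") as cm:
            load_defaults()

        # cm.output holds formatted messages in the order they were emitted
        self.assertEqual(len(cm.output), 2)
        self.assertIn("as dates.", cm.output[0])
        self.assertIn("as dates or ordinals.", cm.output[1])


def run_sketch():
    """Run the test case and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestWarningOrder)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Because the assertions index into `cm.output`, a test written this way is sensitive to the order in which the default values are computed. Reordering the entries of `default_value` in the loader would require updating the indices.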