Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split hazard module #871

Merged
merged 234 commits into from
May 8, 2024
Merged

split hazard module #871

merged 234 commits into from
May 8, 2024

Conversation

emanuel-schmid
Copy link
Collaborator

@emanuel-schmid emanuel-schmid commented Apr 12, 2024

Changes proposed in this PR:

  • divide the Hazard class into 3 modules, base, io and plot.
  • the hazard.base module had more than 2000 lines and has passed the limits of best practices code file size by far.
  • with this PR the Hazard class is inheriting from two classes which are preoccupied with read/write and plotting methods respectively
  • for the user of the class nothing has changed at all.

This PR fixes #

PR Author Checklist

PR Reviewer Checklist

@emanuel-schmid emanuel-schmid changed the base branch from main to develop April 12, 2024 14:39
@chahank
Copy link
Member

chahank commented May 8, 2024

Excellent PR! This will make the code much cleaner.

Regarding the tests, should the test for IO and Plot be made integration tests only? And should the integration tests for hazard also be split up? And should we aim (not in this PR) to add test for the Plot parts in the future?

@emanuel-schmid
Copy link
Collaborator Author

Should the test for IO and Plot be made integration tests only?
🤷 Imho:

  • plots ought to be tested in integration tests only. For two reasons.
    1. how to write a proper unit test for plots is an unsolved problem.
    2. they take too long.
  • io on the other hand can partly stay in unit tests, as long as
    it's fast and goes along the line assert read(write(x) == x.

Should the integration tests for hazard be split up?
No strong feelings. But, in general, I agree: the modularization of the integration tests could be better structured. Currently we have somehow a mix of distribution patterns: single tested method (test_multi_processing), single tested module (eg, test_nightlight), multiple tested modules together (test_engine), functional category (test_plot, test_multi_processing as meant to be?). Hence assigning a test to an existing test module is somewhat arbitrary.

Adding tests for the Plot parts?
Yes! (remember: we always aim for coverage 100% 😁)

@emanuel-schmid
Copy link
Collaborator Author

testing policy discussion to be continued

@emanuel-schmid emanuel-schmid merged commit 6a117bd into develop May 8, 2024
12 checks passed
gdeskos pushed a commit to gdeskos/climada_python that referenced this pull request Sep 16, 2024
* Add geodataframe to centroids: first commit

* Remove raster methods, obsolete methods, and fix some.

* Remove test for removed methods

* Fix from geodataframe method

* Remove set meta to lat lon

* Update all I/O methods

* Remove area pixel

* Add method to compute the area per pixel (not attribute)

* Make get_area_pixel

* Remove not often used dist_coast and elevation

* Remove unused meta

* Change Centroids call to new signature

* Remove vector - raster hazard methods

* Restore legacy excell reader

* Remove clear method

* Remove clear test

* Update read/write centroids in hazard

* Support empty on land or region id

* Check for empty region id

* Set region id correctly

* Remove scheduler

* Add docstring

* Add legacy read hdf5

* Replace incorrect dot product

* Remove meta in assign centroids

* Remove matlab test file

* Update set lat/lon

* Update test with init

* Add note on method

* Add from_meta class method

* Update centroids init

* Update read raster intensity / fraction data

* undo changelog duplications

* Update test for points outside of raster within threshold

* Docstring and cosmetics

* Add to default crs method

* Fix legacy from hdf5 for empty extra values

* Update legacy from_hdf5 to exclude 'latitude'/'longitude' from extra

* Update centroids.select to work properly with mask and indices

* Fix forecast translate bug

* Remove _set_centroids

* Remove reproject raster

* Remove test hazard raster

* Update naming for plot fraction centroids

* Replace HAZ_DEMO_MAT with HAZ_TEST_TC

* Replace mat file with hdf5 test file

* Rewrite write raster method for hazard

* Add a default for sel_cen in mask

* Correct indent typos

* Change duplicate test function name

* Replace mat file with hdf5

* Remove unused mat file import

* Add legacy code to read old centroid hdf5 files.

* Update access to centroids dist coast in gdf

* test_trop_cyclone: get test centroids from data api

* Add method to estimate meta raster from centroids

* Remove unecessary conversion to ne_geom

* Add good lat/lon values for unit test

* centroids.centr.write_hdf5: eliminate side effect

* centroids.centr.write_hdf5: elimination of side effect futile for pandas>=2.1

* centroids.test.TestCentroidsFuns.test_select_pass: adapt to downsized LAT/LON arrays

* test_vec_ras: fix TestCentroids.test_centroids_check_pass

* test_vec_ras: remove TestReader.test_write_read_points_h5 as it is redundant
to and covered by test_centr.TestCentroidsWriter.test_read_write_hdf5

* fix typo in from_exposures method

* centroids.centr.from_csv: write class method

* centroids.centr.from_csv: correct order of xy

* centroids.centr.from_excel: update method for gdf

* centroids.centr. update csv and excel methods

* Update tests in test_vec_ras

* Update more tests and remove some with meta

* centroids.centr.from_excel fix columnn indexing

* Add possibility to NOT recenter crs for countires.

This lead to a bug in the assign region id method because longitude of the countries geometry was shifted, but not the entry lat/lon points.

* Add to_csre convenience method

* Prepare the stage for setting other than admin0 region id

* Remove uneeded import

* cenctroids.centr: fix the obvious

* gencenctroids.centr: change crs of the geodataframe instead of the geometry

* white space cosmetics

* PEP8

* Add solid region id and on land tests

* Auto stash before merge of "feature/centroids_as_gdf" and "origin/feature/centroids_as_gdf"

* Update distance and area tests

* Update set crs in load vector shape file

* centr.Centroids.from_excel: deal with region_id column

* Fix write hazard raster with centroids.get_meta

* Area pixel now correctly uses CEA

* Add actual NE CRS test.

* Make on_land and region_id always part of centroids

* Update docstring

* Update import path

* Add overwrite argument to set region id and on land

* Remove from base grid method

* Fix some linting

* Add some docstrings

* Avoid costly computation in from geodataframe

* Add comment

* Fix typo

* Fix set crs in from geodataframe

* Make properties return arrays and not series

* Remove not needed matlab vars

* Revert return numpy arrays.

* update from excel

* update from_excel and from_csv

* centroids.centr: from_excel fix column renaming and adapt to "new argument names"

* Centroids.from_excel fix: switch key,val in test_storm_europe rather than in centr!

* Return numpy arrays instead of series

* Add some basic tests

* Add tests for meta

* Add tentative changelog.

* Make keywords argument only : name lat/lon : remove set_*

* Auto stash before revert of "Make keywords argument only : name lat/lon : remove set_*"

* remove file

* Fix exposures method test

* Fix setting region id and on land at init.

* Fix typo

* Fix all close values for big numbers

* Improve cosmetics.

* update docstrings, doc cosmetics

* update from_csv method, add test

* remove DEF_VAR_CSV

* Update from_excel method and test

* remove unnecessary import

* fix storm_europe tests

* improve csv and excel tests

* Apply suggestions from code review

Co-authored-by: Lukas Riedel <34276446+peanutfun@users.noreply.github.com>

* add excel, csv write methods and tests

* Replace gdf with _gdf

* Revert "Replace gdf with _gdf"

This reverts commit c7e6ffd.

* Update to_crs methods

* Raise error if wrong exposures

* Make from geodataframe more restrictive.

* Use consistently to_crs method

* Improve read centroids

* Add literal to import

* Update to crs with inplace argument

* improve excel, csv method, test based on review

* Update typing

* Fix equal centroids for different gdf columns ordering

* Make consistent gdf column ordering

* Simplify from excel/csv

* Add support for legacy hazard excell

* Avoid single column loading error for df

* fix path in write_excel, write_csv

* fixing docstrings in centr.py

* Update code cosmetics

* Fix typo

* Allo for kwargs in init and update from exposures

* code linting

* Centroids: test from_meta

* Centroids: refined tests

* Centroids: fix from_exposures

* Centroids: implement get_pixel_shapes

* Centroids: code linting and docstrings

* hazard.test.test_base: remove undefined classes from main

* hazard.base: remove unused imports

* hazard.base: clean up imports

* cosmetics

* hazard.base.write_htdf5: add inline comment.

* Centroids: fix _gdf_from_legacy_hdf signature

* Centroids._gdf_from_legacy_hdf5: static, not classmethod

* hazard.centroids.centr: pylint

* centroids.centr : fix pydoc typos and pylint

* climada.hazard.base: pylint

* hazard.centroids.centr: pylint

* climada.test.test_calicbration: use hzard test file with hdf5 format

* Update climada/hazard/centroids/test/test_vec_ras.py

Co-authored-by: Emanuel Schmid <51439563+emanuel-schmid@users.noreply.github.com>

* Hazard: fix write_raster, and some docstrings

* test.test_calibration: fix test file name

* test_api_client: added test for basic centroids plotting
(just because it fails in PR CLIMADA-project#787)

* centr.Centroids.plot: all changes reverted
(with the exception of the obsolete `self.set_meta_to_lat_lon()`)

* Hazard tutorial: use centroids.get_meta() instead of the attribute meta

* Hazard tutorial: use Centroids() instead of from_lat_lon()

* hazard tutorial: update Hazard from raster section

* hazard tutorial: fix dist_coast access

* climada.hazard.Hazard: remove broken vector file support

* split hazard.base module into base, base_io and base_plot

* test_base_xarray: logs format changed due to "sub-modules"

* climada.hazard.HazardIO: remove broken vector file support

* hazard.base_io: remove from_mat, move reproject_vector back to base

* rename test_base_xarray

* hazard.test: move TestReadExcel and TestReadHDF5 to test_base_io

* climada.hazard.test.base/_io: fix imports

* climada.hazard.centr: add deprecated methods section

* fix from_tracks method, remove side effect

* climada.hazard.trop_cyclone.TropCytlone.from_tracks: option for predefined dist_coast in centroids

* TropicalCyclone.from_tracks pydoc

* white space cosmetics

* CHANGELOG: list deprecated and removed methods

* Refactor and add to centroids testing

* trop_cyclone.TropCyclone.from_tracks: fix dist_coast type

* climada.hazard.Centroids: raise exception in deprecated methods that really _are_ failing now

* redo lost commits on hazard.base

* changelog: list added Centroids methods

* hazard.centr: depreecated methods: pydoc string instaed of pass

* fixing geodataframe issues

* Revert "fixing geodataframe issues"

This reverts commit d6d2859.

* Changelog and module renaming

* Changelog and module renaming

* module renaming

---------

Co-authored-by: Chahan Kropf <chahan.kropf@posteo.com>
Co-authored-by: Sarah Hülsen <49907095+sarah-hlsn@users.noreply.github.com>
Co-authored-by: Igor Detring <igor.detring@dwd.de>
Co-authored-by: Chahan M. Kropf <chahan.kropf@usys.ethz.ch>
Co-authored-by: Lukas Riedel <34276446+peanutfun@users.noreply.github.com>
Co-authored-by: Thomas Vogt <thomas.vogt@pik-potsdam.de>
Co-authored-by: Thomas Vogt <57705593+tovogt@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants