Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/fix nightlight test #716

Merged
merged 2 commits into from
May 10, 2023
Merged

Conversation

NicolasColombi
Copy link
Collaborator

@NicolasColombi NicolasColombi commented May 8, 2023

Changes proposed in this PR:

  • Create .tif.gz file with a mock .tif image in it to run the test: climada/test/test_nightlight/test_load_nighlight_noaa, after the test is executed, the image is automatically delated.
  • the test failed because of a missing file in the jenkins environment

This PR fixes

  • A failing integration test load_nightlight_noaa

PR Author Checklist

PR Reviewer Checklist

The test was failing the integration check on jenkins.
Create fake .tif.gz file, execute test, delate.tif.gz file.
@NicolasColombi NicolasColombi marked this pull request as ready for review May 8, 2023 13:25
@emanuel-schmid emanuel-schmid merged commit 15e3904 into develop May 10, 2023
@emanuel-schmid
Copy link
Collaborator

🙌

@emanuel-schmid emanuel-schmid deleted the feature/fix_nightlight_test branch May 10, 2023 07:01
@emanuel-schmid emanuel-schmid restored the feature/fix_nightlight_test branch May 10, 2023 07:29
@emanuel-schmid
Copy link
Collaborator

Thanks, looks quite good, but the test failed on my computer.

  1. tifffile is not part of the default installation. we'd need to request it, e.g., in env_develop.yml.
  2. the test fails nevertheless:
FAIL: test_load_nightlight_noaa (climada.test.test_nightlight.TestNightlight)
Test that data is downloaded if not present in SYSTEM_DIR,
----------------------------------------------------------------------
Traceback (most recent call last):
  File "climada_python\climada\test\test_nightlight.py", line 184, in test_load_nightlight_noaa
    self.assertIn('F182013.v4c_web.stable_lights.avg_vis.tif',str(fn_light))
AssertionError: 'F182013.v4c_web.stable_lights.avg_vis.tif' not found in '~\\climada\\data\\F182013.v4c_web.stable_lights.avg_vis.p'
  1. As the "real data" has such patterns and we probably don't want to mess with it, I'd suggest to use something more exotic than 'F18'. Would 'E99' also work?
  2. In order to make the patterns more obvious, I'd suggest to use formatted strings, e.g.,
pattern = f"{sat_name}{year}.v4c_web.stable_lights.avg_vis."`
img = f"{pattern}.tif.gz"
pfile = f"{pattern}.p"

@NicolasColombi
Copy link
Collaborator Author

NicolasColombi commented May 11, 2023

Hi Emanuel, thanks for the suggestions! I might have an idea of why this happens.
If no arguments are provided to load_nighlight_noaa() the function does the following:

  1. Looks for a file with the pattern: '*'2013'*'.stable_lights.avg_vis.p in SYSTEM_DIR. If the file is present the function will return the file name that match the pattern. Example: F182013.v4c_web.stable_lights.avg_vis.p In this example the file present in the folder had satellite F18.

  2. If 1. doesn't apply, it looks for a file with the same pattern but with extention .tif.gz. If it finds one, it uses unzip_tif_to_py() to unzip it and save it as pickle. At this point unzip_tif_to_py() doesn't return the file_name with extension .p but the filename with extension .tif. Consequently so does load_nighlight_noaa().

Second point is that 2) is executed only if 1) is not, meaning that if in the folder there are both a .p and .tif.gz file, load_nighlight_noaa() will run 1) and return the file name ending with .p. What I suspect is that in the Jenkins environment there is already a file F182013.v4c_web.stable_lights.avg_vis.p, meaning that even if we create a fake .tif.gz on the fly in the test, point 2) never gets executed.

To solve this I would suggest two things:

  1. make unzip_tif_to_py() return the file name of the actual .pfile it saved, instead of .tif. But I would like to double check that with you @emanuel-schmid

  2. check if a .p file already exist in SYSTEM_DIR on the Jenkins environment. If yes, I would delate it.

using E99 instead of F18 works if the file is present in the folder (if we create it and put it there), if the file is not present load_nightlight_noaa will download it, but the satellite name on the noaa website are F10-F18. But for our use case it works 👍

@emanuel-schmid
Copy link
Collaborator

Hi @NicolasColombi Great analysis, many thanks!
Why both unzip_tif_to_py and load_nightlight_noaa even bother to return that filename is beyond me.
Obviously it is not meant for accessing the file later on, since in the .tif case the file is not even present anymore.
I suppose that information may be useful for those who are in the process of creating litpop files. I think that consumes a lot of time and resources. The pickle file is big and fast while the tif.gz file is small and slow. So for coordination it may be good to know where your data comes from... 🤔
Bottom line: I rather wouldn't change the behavior of neither method. There may be workflows that would need adaptions afterwards.

But yes, you are right, there is a file F182013.v4c_web.stable_lights.avg_vis.p on Jenkins. Well - there was one. I deleted it.

@emanuel-schmid
Copy link
Collaborator

Let's continue the discussion in the new PR #718

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants