Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor clock drift #84

Merged
merged 10 commits into from
Sep 11, 2020
Merged

Conversation

carloshorn
Copy link
Collaborator

@carloshorn carloshorn commented Sep 2, 2020

This PR closes #76 and closes #80 and the associated PRs: closes #81 and closes #82.

Enhancements

  1. By adding more comment lines and proper variable names, the code should be more comprehensible.
  2. By making reader.times a property, there is no need to synchronize the array with reader.utcs anymore. This makes sense, because both arrays contain the same information but expressed in different data types.

Fixes

  1. It turned out that IndexError in PODReader._adjust_clock_drift #76 was a consequence of an error in computing the maximum line number of shifted line indices.

    max(self.scans["scan_line_number"] - min_idx)) + 1

    The - max_idx is wrong, see discussion in Fix clock drift rounding error #82.

  2. This PR takes care of skipped scan lines which addresses the following TODO:

    TODO: bad things might happen when scanlines are skipped.

    by taking into account the complete line number range when sub-setting the missed lines, instead of only considering the shifted lines as in the current implementation:

    pygac/pygac/pod_reader.py

    Lines 446 to 448 in 5c59f2b

    missed = sorted((set(line_indices) |
    set(line_indices + 1))
    - set(self.scans["scan_line_number"]))

  3. The current implementation computes wrong utc values for missed scan lines if the first scan line number is not equal to one, see

    pygac/pygac/pod_reader.py

    Lines 463 to 464 in 5c59f2b

    missed_utcs = ((np.array(missed) - 1) * np.timedelta64(500, "ms")
    + self.utcs[0])

    The correct time transformation is:

missed_utcs = ((missed - self.scans["scan_line_number"][0])*np.timedelta64(500, "ms")
               + self.utcs[0])
  1. The computed longitudes and latitudes for the slerp interpolation should correspond to the utcs before clock adjustment. So the time difference between each line in the interpolation array is equal to the scan rate of 500 ms. Therefore, it should be clock_drift_adjust=False in the following call:

    pygac/pygac/pod_reader.py

    Lines 466 to 468 in 5c59f2b

    mlons, mlats = self.compute_lonlat(width=self.lats.shape[1],
    utcs=missed_utcs,
    clock_drift_adjust=True)

    otherwise, the values in complete_lons, complete_lats are not necessarily in the correct order.

  2. It corrects Reduce rounding error in POD reader adjust clock drift #80, by making reader.times a property and taking into account milliseconds in the final assignment

    self.utcs += offsets.astype('timedelta64[s]')

    Furthermore, note that this line is also incorrect, because after
    offsets *= -2

    the variable offsets has the wrong units (1/scan_rate = 1/(0.5 seconds/line) = 2 lines/seconds)

Potential Side Effects:

  1. This PR removes the method Reader.compute_lonlat, see
    def compute_lonlat(self, width, utcs=None, clock_drift_adjust=True):

    It is not considered as necessary for the public interface of Reader, because it is only called for clock drift adjustment, where the calling method reader._adjust_clock_drift is private already.

@carloshorn
Copy link
Collaborator Author

I noticed that there were not tests yet regarding the clock drift adjustment.
I would postpone the creation of tests after the first code review.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for starting this. I would have liked to see tests for the changed functionality or bugfixing before the refactoring took place, it would ensure that the refactoring is just a refactoring, ie that all the tests are still passing.
No need to delay tests any further now though, go ahead.

Comment on lines 424 to 425
# TODO: Are we sure all satellites have this scan width in degrees ?
# TODO: Does this also work for LAC?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good questions, they should be answered before we merge this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure all satellites have this scan width in degrees ?

According to Table J-1 (Satellite Scanning Instrument Parameters for TIROS-N through NOAA-14) in the KLM guide, the max scan angle for AVHRR/2 is given by 55.37 deg. I don't know where the 55.385 deg in the code come from, but the 55.37 deg is also the default in pyorbita..geoloc_instrument_definitions.avhrr_gac.

Does this also work for LAC?

No, according to Table 8.3.1.3.1-1. (LAC/HRPT Data Characteristics.) the scan rate for LAC/HRPT is 360 scans per minute => 1/6 seconds/line instead of the 1/2 seconds/line which is currently assumed in PODReader._adjust_clock_drift. This can be fixed by making the scan rate a LAC/GAC reader attribute. This should be passed as frequency argument to pyorbita..geoloc_instrument_definitions.avhrr_gac.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that the GAC/LAC readers already have the attribute reader.scan_freq... I will definitely use it!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Table J-1 (Satellite Scanning Instrument Parameters for TIROS-N through NOAA-14) in the KLM guide, the max scan angle for AVHRR/2 is given by 55.37 deg. I don't know where the 55.385 deg in the code come from, but the 55.37 deg is also the default in pyorbita..geoloc_instrument_definitions.avhrr_gac.

I will have to check, but I think this was calculated somehow. It's somewhere in pyorbital maybe.

Comment on lines -439 to +441
return (((year - 1970).astype('datetime64[Y]')
+ (jday - 1).astype('timedelta64[D]')).astype('datetime64[ms]')
return (year.astype(str).astype('datetime64[Y]')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the logic here?

Copy link
Collaborator Author

@carloshorn carloshorn Sep 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually try to avoid hard coded numbers as much as possible. In this case, I did not like that PyGAC hard codes the numpy.datetime64 epoch (1970). I know that there is a very low chance that numpy will ever change the epoch, but still I think PyGAC should not make any assumptions on it.
The shortest way I found to become independent of the epoch was the string conversion, because something like epoch = int(str(np.datetime64(0, 'Y'))) looks ugly to me and numpy also does not have any attribute to access its epoch directly (or at least I could not easily find it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks for explaining!

@carloshorn
Copy link
Collaborator Author

I would have liked to see tests for the changed functionality or bugfixing before the refactoring took place, it would ensure that the refactoring is just a refactoring, ie that all the tests are still passing.

Good point for the future, I am not yet much into TDD, but I clearly see the benefit. How can we proceed on it?
I could create a new branch where I implement the test and include the fixes and changes step by step. Or is the now implemented test meaningful enough to show the correctness of this PR?

@mraspaud
Copy link
Member

mraspaud commented Sep 3, 2020

@carloshorn looks go with the test you added! I don't think you need to rewind your work now.

As far as TDD goes, it's just as the name says :) Write a test first (that fails) and then proceed with the implementation so that the test passes. When the test passes, refactor so that the code becomes clean. See a demo of how to do this here: https://www.youtube.com/watch?v=58jGpV2Cg50&t=2628s

The hardest part is to know what to test. My rule of thumb is that every line should be tested, but we test only public functionality, not implementation details. We can talk more on slack about it if you want (in #random for example).

@carloshorn
Copy link
Collaborator Author

According to Table J-1 (Satellite Scanning Instrument Parameters for TIROS-N through NOAA-14) in the KLM guide, the max scan angle for AVHRR/2 is given by 55.37 deg. I don't know where the 55.385 deg in the code come from, but the 55.37 deg is also the default in pyorbita..geoloc_instrument_definitions.avhrr_gac.

I will have to check, but I think this was calculated somehow. It's somewhere in pyorbital maybe.

I am still trying to understand the given scan angle of 55.385 deg which is passed to pyorbital in the current implementation:

pygac/pygac/reader.py

Lines 523 to 524 in 5c59f2b

sgeom = avhrr_gac(utcs.astype(datetime.datetime),
self.scan_points, 55.385)

Unfortunately, I could not find any clear hint in pyorbital. My first guess, that it is the scan midpoint plus half a scan step, did not lead to this number.
Then, the following line got my attention:
https://github.com/pytroll/pyorbital/blob/42768fb839f22fe05254bb98aa8a774199462a2a/pyorbital/geoloc.py#L284
But still, I could not reproduce the 55.385 degrees.

Can you think of anything else to try, or maybe some verbal ideas/explanations why the scan angle needs an adjustment?

@ninahakansson
Copy link
Collaborator

@carloshorn Nice work with the PR! About the 55.385 this might be inherited from the PPS-ahamap package. I found a note from 2002. I can ask KG if he remembers :).

   /*    Replaced with calculations taken from 6S Users' Guide Version 2 /KG */

   for(y=0;y<self->podlac->podHeader.numberOfScans;y++) {
      for(x=0;x<LAC_MXPIX;x++) {
	 scan_angle=55.385*abs(x-1024)/1024;
	 satzenang[y*LAC_MXPIX+x]=asin((1+sat_alt/earth_radius)*sin(scan_angle*DEG_TO_RAD))*RAD_TO_DEG;
      }
   }

@carloshorn
Copy link
Collaborator Author

@ninahakansson, please ask KG. I hope he remembers, otherwise, I would vote for using the default value 55.37 from the KLM Guide, because I would prefer to have a reference/justification/explanation for any assumption/choice/parameter that appears in the code.
If someone remembers or comes up with an explanation later on, we can still open an issue, change back to 55.385 and add a comment line with the explanation.
What do you think?

@mraspaud
Copy link
Member

mraspaud commented Sep 7, 2020

From what I can find, pyorbital is also defaulting to 55.37 for avhrr: https://github.com/pytroll/pyorbital/blob/master/pyorbital/geoloc_instrument_definitions.py#L52-L102

@mraspaud
Copy link
Member

mraspaud commented Sep 7, 2020

Also notice that the scan angle is 55.25 for noaa 16:
https://www.ospo.noaa.gov/data/ppp/notice_files/mar0101.html

@ninahakansson
Copy link
Collaborator

KG had nothing to add and @abhaydd showed me that in one part of the KLM-guide that value is 55.4 and that the value varies between sources, versions of sources and actually between instruments. I have no opinion about updating it or not. As it is a small change PPS will be OK either way.

@carloshorn
Copy link
Collaborator Author

Also notice that the scan angle is 55.25 for noaa 16:
https://www.ospo.noaa.gov/data/ppp/notice_files/mar0101.html

Currently, there is just a clock drift adjustment for POD satellites, but it is truly a thing to remember when adding clock drifts for KLM satellites in the future.

@carloshorn
Copy link
Collaborator Author

@ninahakansson, thanks for checking. In this case, I would stick on one source, which could be Table J-1 (Satellite Scanning Instrument Parameters for TIROS-N through NOAA-14) in the KLM guide and use it to justify the choice in the code.
In the current PR, this is done indirectly by passing the responsibility to the default value of pyorbital.geoloc_instrument_definitions.avhrr_gac, which refers to exactly this table in its doc-string.

@mraspaud
Copy link
Member

mraspaud commented Sep 7, 2020

@carloshorn in principle clock drift adjustment shouldn't be needed as the clock is updated daily on the klm series.

@carloshorn
Copy link
Collaborator Author

@mraspaud, cool, then we are lucky and do not need to implement an extra case for NOAA-16.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for digging into this.

@mraspaud
Copy link
Member

mraspaud commented Sep 7, 2020

@sfinkens I think you need to run your tests on this one

@ghost
Copy link

ghost commented Sep 7, 2020

Congratulations 🎉. DeepCode analyzed your code in 2.218 seconds and we found no issues. Enjoy a moment of no bugs ☀️.

👉 View analysis in DeepCode’s Dashboard | Configure the bot

@carloshorn
Copy link
Collaborator Author

@mraspaud, sorry I have not noticed that you were about to approving this. I went through the code a last time for my-self.

@mraspaud
Copy link
Member

mraspaud commented Sep 7, 2020

Retry DeepCode

@sfinkens
Copy link
Member

sfinkens commented Sep 8, 2020

Awesome, now I'm able to understand what's happening 😄 Thanks for digging into this! I'll run the tests ASAP.

@mraspaud
Copy link
Member

@sfinkens did you run the tests on this one too?

@sfinkens
Copy link
Member

sfinkens commented Sep 11, 2020

I'm on it. There are differences, as expected, but I think not critical. Currently hunting a memory leak occuring while plotting the differences...

@sfinkens
Copy link
Member

Regression tests show minor differences for the angles, which is expected because the scanline timestamps were corrected.

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_sun_sensor_azimuth_difference_angle

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_solar_zenith_angle

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_solar_azimuth_angle

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_sensor_zenith_angle

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_sensor_azimuth_angle

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_reflectance_channel_3

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_reflectance_channel_2

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_reflectance_channel_1

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_longitude

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_latitude

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_brightness_temperature_channel_5

AVHRR-GAC_FDR_1C_N07_19850129T193012Z_19850129T211513Z_R_O_20200101T000000Z_0100 nc_brightness_temperature_channel_4

@mraspaud mraspaud merged commit aa841ef into pytroll:master Sep 11, 2020
@mraspaud
Copy link
Member

Thanks for the hard work everyone!

@carloshorn carloshorn deleted the refactor_clock_drift branch September 11, 2020 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce rounding error in POD reader adjust clock drift IndexError in PODReader._adjust_clock_drift
5 participants