feat: add script to apply ESA quality mask #14
Checking the quality mask for every granule could be very slow, but there is no need to do that. The quality mask needs to be applied only if the metadata indicates data loss. For example, the following xml file: [...] So we can grep for "There is data loss in this tile" in GENERAL_QUALITY.xml. PS: I don't know why the xml markup tags were removed once I posted the comment. But the message remains.
Thanks for the great suggestion and background information @junchangju! It sounds like it will save a lot of time so I'll add that in as a pre-check. I found this metadata in the examples I've been using and saw that it describes the impacted bands as ...
<Earth_Explorer_File>
<Data_Block>
<report>
<checkList>
...
<check>
<inspection creation="2022-11-21T07:37:42.488Z" duration="11940" execution="2022-11-21T07:37:42.495Z" id="Data_Loss" item="S2A_OPER_MSI_L1C_TL_ATOS_20221121T053331_A038726_T45TXF_N04.00" itemURL="/mnt/nfs-l1-02/l1-processing/processing-id-17769-2022-11-21-06-03-12/TaskTable_20221121T060312/FORMAT_METADATA_TILE_L1C_ADAPT/output/PDI_DS_TILE_LIST/S2A_OPER_MSI_L1C_TL_ATOS_20221121T053331_A038726_T45TXF_N04.00/" name="Check TECQUA for data loss " priority="5" processingStatus="done" status="FAILED"/>
<message contentType="text/plain">There is data loss in this tile</message>
<extraValues>
<value name="Affected_Bands">B01 B02 B03 B04 B05 B06 B07 B08 B09 B10 B11 B12 B8A</value>
</extraValues>
</check>
</checkList>
</report>
</Data_Block>
</Earth_Explorer_File>
Ah, actually it's not always all of the bands. I had this in the 3rd example I checked,
<check>
<inspection creation="2023-03-05T08:36:45.870Z" duration="23651" execution="2023-03-05T08:36:45.877Z" id="Data_Loss" item="S2B_OPER_MSI_L1C_TL_2BPS_20230305T072429_A031305_T43QDG_N05.09" itemURL="/mnt/local/l1ctile/work/3be743a1-28c3-4010-84c9-8ccca1586310/ipf_output_L1CTile_20230305082123/L1CTile/TaskTable_20230305T082127/FORMAT_METADATA_TILE_L1C/output/PDI_DS_TILE_LIST/S2B_OPER_MSI_L1C_TL_2BPS_20230305T072429_A031305_T43QDG_N05.09/" name="Check TECQUA for data loss " priority="5" processingStatus="done" status="FAILED"/>
<message contentType="text/plain">There is data loss in this tile</message>
<extraValues>
<value name="Affected_Bands">B01 B02 B03 B09 B10 B11 B12 B8A</value>
</extraValues>
</check>
Then looping over all bands should be fine.
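A minimal sketch of the pre-check discussed above, assuming the report can be parsed with Python's standard `xml.etree.ElementTree`. The function name `affected_bands` and the trimmed `SAMPLE` report are illustrative and not taken from the PR's script. Tags are compared by local name because real reports may declare an XML namespace (an assumption; the snippets in this thread show none).

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative report modelled on the second example in this thread.
SAMPLE = """<Earth_Explorer_File>
<Data_Block><report><checkList>
<check>
<inspection id="Data_Loss" name="Check TECQUA for data loss " processingStatus="done" status="FAILED"/>
<message contentType="text/plain">There is data loss in this tile</message>
<extraValues>
<value name="Affected_Bands">B01 B02 B03 B09 B10 B11 B12 B8A</value>
</extraValues>
</check>
</checkList></report></Data_Block>
</Earth_Explorer_File>"""


def _local(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def affected_bands(general_quality_xml):
    """Return bands listed by a FAILED Data_Loss check, or [] if there is none."""
    root = ET.fromstring(general_quality_xml)
    bands = []
    for check in root.iter():
        if _local(check.tag) != "check":
            continue
        # The <inspection> element is a direct child of <check>.
        inspection = next((c for c in check if _local(c.tag) == "inspection"), None)
        if inspection is None:
            continue
        if inspection.get("id") != "Data_Loss" or inspection.get("status") != "FAILED":
            continue
        # <value name="Affected_Bands"> holds a space-separated band list.
        for value in check.iter():
            if _local(value.tag) == "value" and value.get("name") == "Affected_Bands":
                bands.extend((value.text or "").split())
    return bands


bands = affected_bands(SAMPLE)
```

An empty result means the mask step can be skipped for that granule; otherwise the caller can either restrict the work to the listed bands or simply loop over all bands as suggested above.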
Closing this PR because it isn't running tests in CI. I now have access to this repo and can submit a PR from a branch. See #15 for followup.
What I'm changing
This PR is intended to address this ticket, NASA-IMPACT/hls-sentinel#155
Specifically this is intended to replace a utility written in C to apply a binary quality mask to Sentinel-2 L1C data that masks lost or degraded MSI pixel values. The original PR was, NASA-IMPACT/hls-sentinel#152
This utility script is intended to fit into the rest of the HLS Sentinel-2 related processing. To apply the script we would need to do a few extra steps,
- [...] (`hls-utilities@v1.10`)
- [...] `hls-utilities` in the `hls-sentinel` container to include this update (https://github.com/NASA-IMPACT/hls-sentinel/blob/20180e0d0cd286ffefe61bd6aeeaeff056874102/Dockerfile#L123)
- [...] `hls-sentinel` processing steps to include running this script as part of `sentinel_granule.sh` (e.g., around here https://github.com/NASA-IMPACT/hls-sentinel/blob/20180e0d0cd286ffefe61bd6aeeaeff056874102/scripts/sentinel_granule.sh#L44)
- [...] `hls-sentinel` image used in our production pipelines by tagging our new container appropriately (i.e., "latest")
How I did it
The ESA quality mask is an 8 band single bit image that encodes 0/1 for 8 quality related attributes. The bands we're interested in are bands 3 & 4 (counting from 1 in GDAL notation) which encode if the pixel retrieval was either lost or degraded.
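As a rough illustration of the band selection described above, here is a sketch using NumPy arrays in place of real rasters. The function name `apply_quality_mask`, the array shapes, and the toy data are assumptions for illustration and not the PR's actual code. The default fill of `0` follows this PR description's point that L1C imagery uses 0 as its nodata value.

```python
import numpy as np

# 1-based band numbers (GDAL notation) for the lost and degraded pixel flags.
LOST_BAND = 3
DEGRADED_BAND = 4


def apply_quality_mask(image, quality_mask, fill_value=0):
    """Return a copy of `image` with flagged pixels set to `fill_value`.

    image:        2D array of pixel values for one spectral band.
    quality_mask: array of shape (8, rows, cols) holding 0/1 flags.
    """
    lost = quality_mask[LOST_BAND - 1].astype(bool)
    degraded = quality_mask[DEGRADED_BAND - 1].astype(bool)
    bad = lost | degraded  # a pixel is masked if either flag is set
    out = image.copy()
    out[bad] = fill_value
    return out


# Toy data: a 2x3 image and an 8-band mask with one lost and one degraded pixel.
image = np.full((2, 3), 1000, dtype=np.uint16)
mask = np.zeros((8, 2, 3), dtype=np.uint8)
mask[LOST_BAND - 1, 0, 1] = 1
mask[DEGRADED_BAND - 1, 1, 2] = 1
mask[0, 0, 0] = 1  # a flag in another mask band is ignored

masked = apply_quality_mask(image, mask)
```

Returning a copy keeps the sketch easy to test; an in-place variant would assign to `image[bad]` directly.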
I followed the original C code implementation with a few modifications...
- [...] (`04.00`).
- [...] `HLS_REFL_FILLVAL` (`-9999`) to any pixel where the mask was TRUE for either band.
  - [...] `-9999` value is not suitable for us because the original L1C images in JPEG2000 format use `0` as the nodata value (since `-9999` is out of data type range)
  - [...] `0` value is used as a no data value for L1C imagery and this is noted in the metadata
- [...] `hls-sentinel` that some utilities modify in place while others create new data. If it'd be useful for a "debug mode" it would be easy to rewrite the images to new files
How you can test it
Install the test dependencies and run unit tests for this script,
The unit tests are what helped me discover that the JPEG2000 driver will apply lossy compression when we update the file (`r+` access mode) even though the original file had lossless compression in its creation options 🙃
I ran this for one of the granules mentioned in the original C code PR and recorded a tiny demo,
Note the "inspect tool" that shows the imagery values on the right hand side. The S2 L1C imagery doesn't have a No Data Value set, so QGIS shows the 0 as valid pixels. Still, we can see the individual pixel values are correct (0 instead of ~1,000), and the image is stretched differently because of the different minimum values.