run tests on all state/year combinations #30
Merged
I generated STAC items for every unique state/year combination available in the NAIP bucket. This was a total of 228 runs.
Issues I found during testing:
- If the code can't find the resource description and date in the metadata files, it attempts to extract them from the COG href, using a regex to get the date. Most of the COGs have a name with the format m_3510264_ne_13_060_20200905.tif, but some have a name like m_4209601_ne_14_060_20180912_20181211.tif, with an extra 8-digit sequence at the end. In these cases, the actual date for the scene is always the first set of 8 digits (in this case, 20180912) and not the second. The regex was modified to take this into account with an optional clause at the end.
- Some of the XML metadata files from the year 2020 do not contain the xpath gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString for the resource description field, but instead contain the xpath idinfo/citation/citeinfo/title. The shorter xpath was added as a fallback for when the longer one is not found. Most of the metadata files contain the longer xpath and only a handful contain the shorter one.
- For scenes prior to 2020, the logic that extracts the resource description and date when neither is found in the associated metadata file was changed to use the same common method (maybe_extract_id_and_date) that the other cases use.
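To illustrate the date-extraction fix described in the first issue, here is a minimal sketch of a regex with an optional trailing clause. The pattern and the function name extract_date_from_href are assumptions made for illustration; they are not the actual code in this PR.

```python
import re
from typing import Optional

# Hypothetical pattern (not the PR's actual regex): capture the first
# 8-digit group, then allow an optional second 8-digit group before ".tif".
COG_DATE_RE = re.compile(r"_(\d{8})(?:_\d{8})?\.tif$")


def extract_date_from_href(href: str) -> Optional[str]:
    """Return the scene date (YYYYMMDD) from a COG href, or None if absent."""
    match = COG_DATE_RE.search(href)
    return match.group(1) if match else None


single = extract_date_from_href("m_3510264_ne_13_060_20200905.tif")
double = extract_date_from_href("m_4209601_ne_14_060_20180912_20181211.tif")
print(single, double)
```

Because the second 8-digit group sits in a non-capturing optional clause, the captured group is the first date for both name formats.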
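The xpath fallback from the second issue could look roughly like the sketch below, written with the standard library's xml.etree.ElementTree. The function name find_resource_description is hypothetical, and the sketch assumes the longer xpath uses the standard ISO 19139 gmd and gco namespace URIs and that both xpaths are relative to the document root; the PR's real implementation may differ.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: the metadata files use the standard ISO 19139 namespace URIs.
NAMESPACES = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# The two xpaths named in the description, tried in this order.
LONG_TITLE_PATH = (
    "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/"
    "gmd:CI_Citation/gmd:title/gco:CharacterString"
)
SHORT_TITLE_PATH = "idinfo/citation/citeinfo/title"


def find_resource_description(xml_text: str) -> Optional[str]:
    """Try the longer xpath first, then fall back to the shorter one."""
    root = ET.fromstring(xml_text)
    for path in (LONG_TITLE_PATH, SHORT_TITLE_PATH):
        element = root.find(path, NAMESPACES)
        if element is not None and element.text:
            return element.text.strip()
    return None
```

Returning None when neither xpath matches leaves room for the caller to fall back to the COG href, as the first issue describes.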