-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add year filter functionality CDL #1337
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs more coverage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remaining issues are very trivial. Feel free to merge this if you want to focus on different things.
@@ -363,15 +373,30 @@ def _verify(self) -> None: | |||
RuntimeError: if ``download=False`` but dataset is missing or checksum fails | |||
""" | |||
# Check if the extracted files already exist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very unimportant but I wonder if we should instead have a single for-loop over all versions and only download/extract the exact versions we need to. Imagine a situation where someone downloaded and extracted 2021, but then 2022 came out and they wanted to use that too. At the moment, Spack would try to re-download (torchvision will skip this for us) and re-extract the zip file for 2021 even though it's already extracted.
@@ -43,23 +43,23 @@ class CDL(RasterDataset): | |||
is_image = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the glob/regex will still pick up versions other than the ones requested if they are found on the system. You could fix this by editing the filename_regex
during __init__
. We do something similar to this in our Landsat datasets where we patch filename_glob
based on the bands requested.
This PR adds
years
argument to select a number of years for which to download CDL or use CDL so the user doesn't have to download all 15 years. This is how we decided to implement it for NLCD as well.