SSL4EO Landsat Downstream Dataset/module CDL, NLCD #1338

Merged: 28 commits, May 25, 2023

@nilsleh nilsleh commented May 15, 2023

This PR adds datasets and a datamodule for downstream evaluation of SSL methods.

Sensors:

  • L7-L1
  • L7-L2
  • L8-L1
  • L8-L2

Masks:

  • CDL
  • NLCD

This is a NonGeoDataset that relies on datasets created with the help of #1336.

Example CDL TM-TOA:

[Image: Screenshot from 2023-05-24 22-33-21]

@github-actions github-actions bot added the datasets (Geospatial or benchmark datasets) and testing (Continuous integration testing) labels May 15, 2023
@nilsleh nilsleh marked this pull request as draft May 15, 2023 12:16
@adamjstewart adamjstewart added this to the 0.5.0 milestone May 15, 2023
@nilsleh nilsleh marked this pull request as ready for review May 23, 2023 18:23
@github-actions github-actions bot added the datamodules (PyTorch Lightning datamodules) label May 23, 2023
"""
super().__init__(SSL4EOLBenchmark, batch_size, num_workers, **kwargs)

self.train_aug = AugmentationSequential(
Collaborator

If you use the same aug for all splits, you can just set self.aug instead. However, we might want different augs for train to improve performance.

Collaborator Author

Yes, I wrote them out because I thought it remains to be discussed whether we have additional training augs.
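
The trade-off in this thread can be sketched in plain Python. This is only an illustration, assuming a base datamodule that looks for a split-specific attribute such as `train_aug` first and falls back to a shared `aug`. Every class, function, and method name below is a hypothetical stand-in and not the library's API.

```python
from typing import Callable, Dict

Batch = Dict[str, float]
Aug = Callable[[Batch], Batch]


def normalize(batch: Batch) -> Batch:
    """Shared augmentation: scale the image value (stand-in for normalization)."""
    return {**batch, "image": batch["image"] / 255.0}


def train_only(batch: Batch) -> Batch:
    """Train augmentation: normalization plus a stand-in for an extra transform."""
    out = normalize(batch)
    out["flipped"] = 1.0
    return out


class SketchDataModule:
    """Hypothetical datamodule that resolves augmentations per split."""

    def resolve_aug(self, split: str) -> Aug:
        # Prefer `<split>_aug` if it is defined, otherwise fall back to `aug`.
        aug = getattr(self, f"{split}_aug", None)
        if aug is None:
            aug = getattr(self, "aug")
        return aug


class SharedAug(SketchDataModule):
    """One augmentation for every split: only `aug` is set."""

    def __init__(self) -> None:
        self.aug = normalize


class SplitAug(SketchDataModule):
    """Extra augmentation for training only: `train_aug` overrides `aug`."""

    def __init__(self) -> None:
        self.aug = normalize
        self.train_aug = train_only


batch = {"image": 255.0}
shared_train = SharedAug().resolve_aug("train")(batch)
split_train = SplitAug().resolve_aug("train")(batch)
split_val = SplitAug().resolve_aug("val")(batch)
```

With only `aug` set, every split gets the same transform. Adding `train_aug` later changes training without touching validation or test, which is the option the author wanted to keep open.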

torchgeo/datasets/ssl4eo_benchmark.py
)
exists.append(bool(glob.glob(mask_pathname, recursive=True)))
if all(exists):
    return
Collaborator

Let's put a blank line between sections

Collaborator

I meant between the return statement and the comment below

torchgeo/datasets/ssl4eo_benchmark.py
Comment on lines +339 to +344
plt_cmap = ListedColormap(
    np.stack(
        [np.array(val) / 255 for val in self.cmaps[self.mask_product].values()],
        axis=0,
    )
)
Collaborator

Forgot to say this during the CDL review, but don't we also need to map this to 134 classes before plotting? Should have asked you to add an example plot for CDL.

Collaborator Author

@nilsleh nilsleh May 24, 2023

Ah sorry, now I do get what you mean. Yeah you are right.

Collaborator Author

@nilsleh nilsleh May 24, 2023

In NLCD the cmap is already done with the ordinal values. I can fix it in CDL in another PR.
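
The concern in this thread, as I read it, is that a color list built from `cmaps[...].values()` is ordered by position, so the mask must hold ordinal indices (0 to N-1) rather than raw class codes for the colors to line up. A minimal sketch of that remapping follows. The class codes and colors are made up for illustration and are not the real CDL table, and `to_ordinal` is a hypothetical helper.

```python
# Hypothetical sparse class codes mapped to RGB colors (0-255).
cmap = {
    0: (0, 0, 0),
    1: (255, 211, 0),
    5: (38, 112, 0),
    176: (232, 255, 191),
}

# Ordinal index for each raw code, in the same order as cmap.values().
code_to_ordinal = {code: i for i, code in enumerate(cmap)}

# Color table indexed by ordinal value, scaled to the [0, 1] range.
colors = [tuple(channel / 255 for channel in rgb) for rgb in cmap.values()]


def to_ordinal(mask):
    """Replace each raw class code in a 2D mask with its ordinal index."""
    return [[code_to_ordinal[value] for value in row] for row in mask]


raw_mask = [[0, 1, 176], [5, 5, 1]]
ordinal_mask = to_ordinal(raw_mask)

# Every ordinal value is now a valid position in the color table.
max_index = max(max(row) for row in ordinal_mask)
```

Without the remap, a raw code such as 176 has no matching position in a four-color table. After it, each mask value indexes its own color directly.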

Collaborator

@adamjstewart adamjstewart left a comment

I could nitpick for days but I think it's good enough now. Would rather merge it soon so people can start benchmarking. Feel free to merge if you also think it's good enough or address the remaining comments and I'll review again.

@nilsleh nilsleh merged commit 22ee952 into microsoft:main May 25, 2023