Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add data module for LEVIR-CD+ dataset #1707

Merged
merged 36 commits into from
Nov 1, 2023

Conversation

robmarkcole
Copy link
Contributor

Close #1706

@github-actions github-actions bot added testing Continuous integration testing datamodules PyTorch Lightning datamodules labels Oct 31, 2023
@adamjstewart adamjstewart changed the title WIP: Issue 1706 Add data module for LEVIR-CD+ dataset Oct 31, 2023
@adamjstewart adamjstewart added this to the 0.6.0 milestone Oct 31, 2023
@github-actions github-actions bot added the datasets Geospatial or benchmark datasets label Oct 31, 2023
@robmarkcole
Copy link
Contributor Author

robmarkcole commented Oct 31, 2023

Error on cicd: RuntimeError: The daily quota of the file LEVIR-CD+.zip is exceeded and it can't be downloaded. This is a limitation of Google Drive and can only be overcome by trying again later.

After updating the image type to float the plotting broke, now addressing that too

@adamjstewart
Copy link
Collaborator

No tests should require internet access, you should use the fake data in tests/data/levircd. You may have to extract the zip file and commit all individual files so that pytest doesn't create new files that aren't tracked by git.

@robmarkcole
Copy link
Contributor Author

We format the mask as torch.long but loss calc requries float - why do we not format it as such?

@robmarkcole
Copy link
Contributor Author

robmarkcole commented Nov 1, 2023

Test fails with

FAILED tests/datamodules/test_levircd.py::TestLEVIRCDPlusDataModule::test_val_dataloader[train] - 

torchgeo.datamodules.utils.MisconfigurationException: LEVIRCDPlusDataModule.val_dataset has length 0.

which is why the plot test fails also

@adamjstewart
Copy link
Collaborator

We format the mask as torch.long but loss calc requries float - why do we not format it as such?

Kornia augmentations require it to be torch.long. Really wish the Kornia and torchmetrics devs would talk to each other.

@adamjstewart
Copy link
Collaborator

Test fails with

FAILED tests/datamodules/test_levircd.py::TestLEVIRCDPlusDataModule::test_val_dataloader[train] - 

torchgeo.datamodules.utils.MisconfigurationException: LEVIRCDPlusDataModule.val_dataset has length 0.

which is why the plot test fails also

You'll need to set val_split_pct=0.5 during testing, the fake data only has 2 images so 20% of 2 is 0.

@robmarkcole
Copy link
Contributor Author

Appears an empty image is passed in the test..

torchgeo/datasets/levircd.py:240: in plot
    image1 = get_rgb(sample["image1"])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

img = tensor([[[0., 0., 0.]]])

    def get_rgb(img: Tensor) -> "np.typing.NDArray[np.uint8]":
        img = img.permute(1, 2, 0)
        rgb_img = img.float().numpy()
        per02 = np.percentile(rgb_img, 2)
        per98 = np.percentile(rgb_img, 98)
>       rgb_img = (np.clip((rgb_img - per02) / (per98 - per02), 0, 1) * 255).astype(
            np.uint8
        )
E       RuntimeWarning: invalid value encountered in divide

@adamjstewart
Copy link
Collaborator

Could also create fake images with non-zero values, but dividing by epsilon is more robust.

@adamjstewart
Copy link
Collaborator

P.S. We actually need a ChangeDetectionTask for another project we're working on. If you have a working example, it would be good to open a PR and iterate on that.

@robmarkcole
Copy link
Contributor Author

Re ChangeDetectionTask that's next on my list - wanted a couple of datasets to eval on first :-)

torchgeo/datamodules/levircd.py Outdated Show resolved Hide resolved
torchgeo/datasets/levircd.py Outdated Show resolved Hide resolved
robmarkcole and others added 3 commits November 1, 2023 14:36
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
@robmarkcole
Copy link
Contributor Author

The suggested change resulted in

Returning Any from function declared to return "ndarray[Any, dtype[unsignedinteger[_8Bit]]]"  [no-any-return]

@adamjstewart
Copy link
Collaborator

Yep, looks like it doesn't work

@adamjstewart
Copy link
Collaborator

Can you upload a new example plot to the PR description so we can make sure it looks okay?

@robmarkcole
Copy link
Contributor Author

Updated plotting:

image

@robmarkcole
Copy link
Contributor Author

Added cast back in and now get

torchgeo/datasets/levircd.py:239: error: Redundant cast to "ndarray[Any, dtype[unsignedinteger[_8Bit]]]"  [redundant-cast]

@robmarkcole
Copy link
Contributor Author

Latest effort results in

torchgeo/datasets/levircd.py:230: error: "ndarray" expects 2 type arguments, but 1 given  [type-arg]
torchgeo/datasets/levircd.py:236: error: Returning Any from function declared to return "ndarray[Any, Any]"  [no-any-return]

torchgeo/datasets/levircd.py Outdated Show resolved Hide resolved
torchgeo/datasets/levircd.py Outdated Show resolved Hide resolved
@adamjstewart adamjstewart merged commit 1fb0301 into microsoft:main Nov 1, 2023
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datamodules PyTorch Lightning datamodules datasets Geospatial or benchmark datasets testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR LEVIRCDPlusDataModule
2 participants