Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Spacenet 1: Building Detection v1 #129

Merged
merged 10 commits into from
Sep 15, 2021
Merged

Conversation

ashnair1
Copy link
Collaborator

@ghost
Copy link

ghost commented Sep 12, 2021

CLA assistant check
All CLA requirements met.

Copy link
Collaborator

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

For the tests, can you replace the example images with fake images? See https://github.com/microsoft/torchgeo/tree/main/tests/data#raster-data for details on how to create them. We can't redistribute dataset files without worrying about licensing, and larger files mean slower downloads.

torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Sep 12, 2021
Copy link
Collaborator

@isaaccorley isaaccorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor suggestions. Also don't forget to run black, flake8, mypy, isort, and pydocstyle and fix any resulting errors so that the checks will pass.

torchgeo/datasets/__init__.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
torchgeo/datasets/spacenet1.py Outdated Show resolved Hide resolved
@ashnair1 ashnair1 force-pushed the spacenet branch 2 times, most recently from 704c291 to f07741e Compare September 13, 2021 11:22
@ashnair1
Copy link
Collaborator Author

ashnair1 commented Sep 13, 2021

Thanks for the PR!

For the tests, can you replace the example images with fake images? See https://github.com/microsoft/torchgeo/tree/main/tests/data#raster-data for details on how to create them. We can't redistribute dataset files without worrying about licensing, and larger files mean slower downloads.

I've created the dataset using the specified instructions though I had to hack together a script to do so. Incidentally, I would like to discuss the test dataset generation process but I'll raise a separate issue for that.

@ashnair1
Copy link
Collaborator Author

ashnair1 commented Sep 13, 2021

Currently the three remaining issues are those concerning:

  • __getitem__
  • Root data folder
  • 8Band and RGB

I'm looking into __getitem__ and root folder issues now and will update the PR when it's done.

Regarding issue 3 (which is also related to the creation of a Spacenet meta class ), this is dependent on how much we would like to control what data torchgeo makes available to the users.

Other spacenet datasets also contain multiple imagery. For example, spacenet 2, 4 and 5 provide MS (Multispectral), PAN (Panchromatic), PS-MS (Pansharpened Multispectral) and PS-RGB (Pansharpened RGB , though this can be created from PS-MS). Spacenet 6 provides SAR and EO imagery. Spacenet 7 is a time series dataset. Some spacenet datasets are more similar than others but the quality of having more than one input image is pretty consistent.

I'm not too familiar with Spacenet 6 and 7 but for the rest, people generally tend to use the PS-RGB when available since that's most compatible with pretrained backbones.

So we can either provide a subset of the data i.e. the one image that is commonly used in all of the winning solutions or all of it. The former will make implementation easier (probably make a single spacenet file for all the datasets) but the latter is a more true representation of the full dataset.

I'm open to either option.

@adamjstewart
Copy link
Collaborator

We discussed this in a meeting today and it seems like the easiest way to move forward would be to keep this as a VisionDataset with an integer __getitem__ instead of a GeoDataset with bounding box __getitem__. We can always convert it to a GeoDataset later if we really want the ability to combine it with other GeoDatasets in a ZipDataset.

* Create single spacenet.py for all spacenet datasets
* Create single spacenet directory for all spacenet test data
* Create single test_spacenet.py for testing all spacenet datasets
@ashnair1
Copy link
Collaborator Author

Sure, makes sense. I've made the necessary changes.

@ashnair1
Copy link
Collaborator Author

ashnair1 commented Sep 14, 2021

Since Spacenet is currently implemented as a VisionDataset, I've placed it under the Non-geospatial section in the docs even though it has geospatial info. Is this fine?

Copy link
Collaborator

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really close now! Can we rename Spacenet to SpaceNet everywhere?

docs/api/datasets.rst Outdated Show resolved Hide resolved
tests/datasets/test_spacenet.py Outdated Show resolved Hide resolved
@calebrob6
Copy link
Member

Just a heads up -- I tested this dataset and found that the "rgb" images come in three different shapes: {(3, 406, 438), (3, 406, 439), (3, 407, 439)} while the "8band" images are 1/4 the size. It looks like this is actually expected -- https://medium.com/the-downlinq/getting-started-with-spacenet-data-827fd2ec9f53 -- but it means that you can't immediately use it with a dataloader (and will instead need to do some resizing in a transform). Do we want to address that anywhere or let people figure it out?

@adamjstewart is the only thing missing here the "Spacenet" --> "SpaceNet" find and replace?

@adamjstewart
Copy link
Collaborator

Since it's only a 1 pixel difference I think we should handle it ourselves.

Yes, the only thing remaining is the rename.

@ashnair1
Copy link
Collaborator Author

Thanks for the review. Rename complete.

@adamjstewart adamjstewart merged commit 0ce0a59 into microsoft:main Sep 15, 2021
@ashnair1 ashnair1 deleted the spacenet branch September 16, 2021 08:55
isaaccorley referenced this pull request in isaaccorley/torchgeo Sep 18, 2021
* Add Spacenet 1

* Add test data

* Style fixes

* Convert Spacenet1 to VisionDataset

* Add option for selecting imagery

* Consolidate spacenet

* Create single spacenet.py for all spacenet datasets
* Create single spacenet directory for all spacenet test data
* Create single test_spacenet.py for testing all spacenet datasets

* Add copyright

* Reorder Spacenet in docs

* Test both rgb & 8band

* Rename Spacenet -> SpaceNet
@adamjstewart adamjstewart mentioned this pull request Sep 28, 2021
7 tasks
@adamjstewart adamjstewart added this to the 0.1.0 milestone Nov 20, 2021
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* Add Spacenet 1

* Add test data

* Style fixes

* Convert Spacenet1 to VisionDataset

* Add option for selecting imagery

* Consolidate spacenet

* Create single spacenet.py for all spacenet datasets
* Create single spacenet directory for all spacenet test data
* Create single test_spacenet.py for testing all spacenet datasets

* Add copyright

* Reorder Spacenet in docs

* Test both rgb & 8band

* Rename Spacenet -> SpaceNet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasets Geospatial or benchmark datasets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants