We create an `IntersectionDataset`, `train_ds`, from two datasets, `train_image_ds` and `train_mask_ds`. Both have a length of 22 and cover the exact same spatial areas (i.e. there is a 1-to-1 pairing between a tile in `train_image_ds` and a tile in `train_mask_ds`). It looks something like this:
The issue is that `train_ds` unexpectedly has a length of 140. Specifically, the merged index has 140 entries, but only 22 of them (as expected) have an area > 0. I'm guessing this is why we filter out intersections with area <= 0 in the samplers, but I don't remember the details!
I recommend that we filter out intersections with an area of 0 when merging datasets.
The reason for this issue is that rtree considers two bounding boxes to be overlapping even if the area of overlap is 0, for example when two tiles only share an edge or a corner.
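To make the effect concrete, here is a minimal pure-Python sketch. It does not use rtree or torchgeo; the `Box` type, the helper names, and the 3x3 grid of unit tiles are all made up for illustration, and the closed-interval test only mimics the "touching counts as overlapping" behaviour described above.

```python
from itertools import product
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical 2D bounding box, used only for this illustration."""
    minx: float
    maxx: float
    miny: float
    maxy: float


def touches_or_overlaps(a: Box, b: Box) -> bool:
    """Closed-interval test: boxes sharing only an edge or corner still match."""
    return (a.minx <= b.maxx and b.minx <= a.maxx
            and a.miny <= b.maxy and b.miny <= a.maxy)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap, 0 when the boxes only touch or are disjoint."""
    width = min(a.maxx, b.maxx) - max(a.minx, b.minx)
    height = min(a.maxy, b.maxy) - max(a.miny, b.miny)
    return max(width, 0.0) * max(height, 0.0)


# A 3x3 grid of unit tiles; the "image" and "mask" datasets cover the same tiles.
tiles = [Box(x, x + 1, y, y + 1) for x, y in product(range(3), repeat=2)]

# Every (image tile, mask tile) pair that the closed-interval test reports.
hits = [(a, b) for a, b in product(tiles, tiles) if touches_or_overlaps(a, b)]

# Only the pairs whose intersection has a positive area.
positive = [(a, b) for a, b in hits if intersection_area(a, b) > 0]

print(len(hits), len(positive))
```

With this toy grid, the closed-interval test reports 49 pairs, while only the 9 true 1-to-1 pairs have a positive area. It is the same kind of inflation as the 140 versus 22 entries reported above.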
It isn't hard to add a check for this and remove them from the intersection, or from the sampler. The reason we haven't done this already is that some datasets have 0 area on purpose. We have several point GeoDatasets, including GBIF, iNaturalist, and EDDMapS, and I have plans to add others for air pollution as well. I'm not actively using these datasets, and I'm not even sure if our built-in samplers would be useful for these kinds of datasets, but that's the reason things are the way they are. I would be open to changing this, but would need to think about how else we could use point datasets without 0 area files. We could add a parameter to control this, I suppose.
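As a rough sketch of what such a parameter could look like, the snippet below merges two lists of bounding boxes with an opt-out flag. Everything here is an assumption for illustration: the `merge` function, the `drop_zero_area` name, and the plain-tuple box layout are hypothetical and are not part of the torchgeo API.

```python
from itertools import product
from typing import List, Optional, Tuple

# Hypothetical box layout for this sketch: (minx, maxx, miny, maxy).
Box = Tuple[float, float, float, float]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the closed intersection of two boxes, or None if they are disjoint."""
    minx, maxx = max(a[0], b[0]), min(a[1], b[1])
    miny, maxy = max(a[2], b[2]), min(a[3], b[3])
    if minx > maxx or miny > maxy:
        return None
    return (minx, maxx, miny, maxy)


def area(box: Box) -> float:
    return (box[1] - box[0]) * (box[3] - box[2])


def merge(ds1: List[Box], ds2: List[Box], drop_zero_area: bool = True) -> List[Box]:
    """Collect pairwise intersections, optionally skipping zero-area ones.

    `drop_zero_area` is a hypothetical flag: True suits tiled rasters,
    False keeps the matches that point datasets rely on.
    """
    merged = []
    for a, b in product(ds1, ds2):
        box = intersect(a, b)
        if box is None:
            continue
        if drop_zero_area and area(box) == 0:
            continue
        merged.append(box)
    return merged


# Two adjacent raster tiles, and a single zero-area "point" inside the first tile.
rasters = [(0.0, 1.0, 0.0, 1.0), (1.0, 2.0, 0.0, 1.0)]
points = [(0.5, 0.5, 0.5, 0.5)]

print(len(merge(rasters, rasters)))                        # tiles paired with themselves only
print(len(merge(rasters, rasters, drop_zero_area=False)))  # also counts the shared edge
print(len(merge(rasters, points)))                         # point match is dropped
print(len(merge(rasters, points, drop_zero_area=False)))   # point match is kept
```

The example also shows the trade-off raised above: filtering by default fixes the raster case but silently drops every match against a point dataset, so the flag (or some per-dataset default) would need to be turned off there.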
calebrob6 changed the title from "Zero area intersections in IntersectionDataset" to "Zero area intersections in IntersectionDataset result in unexpected dataset lengths" on Apr 20, 2023
Just clarified the title to emphasize that the problem is that the reported length of the IntersectionDataset does not match the expected length, which is confusing to users.