Config for irrigation_scenes and custom SpatioTemporalDataset loader #13
base: main
Initial mmsegmentation configuration file for the irrigation_scenes dataset on https://huggingface.co/datasets/ibm-nasa-geospatial/hls_irrigation_scenes. As this is a time-series dataset with data from four months stored in four different folders, a custom SpatioTemporalDataset class (subclassed from GeospatialDataset) and a LoadSpatioTemporalImagesFromFile class (subclassed from LoadGeospatialImageFromFile) were created to perform the data loading. Training uses only the first 3 months (June, July, August) for now. Also updated the fine-tuning-examples/README.md to mention how to run the irrigation_scenes setup.
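As a rough illustration of the loading idea described above, the sketch below combines one array per month into a single time-series array. It is not the code from this PR: the function name `stack_timesteps` is hypothetical, synthetic NumPy arrays stand in for the TIFF files read from the per-month folders, and stacking along a new leading time axis is an assumption (the actual loader may arrange the timesteps differently).

```python
import numpy as np


def stack_timesteps(monthly_images):
    """Stack per-month (H, W, C) arrays into one (T, H, W, C) array.

    Hypothetical helper: the real LoadSpatioTemporalImagesFromFile class
    may combine timesteps in a different way.
    """
    shapes = {img.shape for img in monthly_images}
    if len(shapes) != 1:
        raise ValueError(f"All timesteps must share one shape, got {shapes}")
    return np.stack(monthly_images, axis=0)


# Three synthetic 4x4 scenes with 6 bands, standing in for June, July, August.
months = [np.full((4, 4, 6), fill_value=i, dtype=np.float32) for i in range(3)]
stacked = stack_timesteps(months)
print(stacked.shape)
```

Checking the shapes up front gives a clearer error than `np.stack` would if one month's scene were cropped differently.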
Config folder has moved from the fine-tuning-examples folder up to the root directory in 464e9f2/NASA-IMPACT#8, so no need to do `../` anymore.
The old open_tiff function used rasterio.open, which stacked the bands/channels in the first position (CHW), but moving to tifffile.imread in 86e9ba9 changed the stacking to the last position (HWC). Need to use channel last (NHWC) for the RandomFlip function since it is somewhat hardcoded to flip on axis 1, and then use TorchPermute to change to channel first (NCHW) so that TorchNormalize (using torchvision, which expects NCHW) works.
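The layout issue can be seen on a single image with plain NumPy. This is only a sketch of the axis bookkeeping, assuming a flip that always acts on axis 1 as described above; it does not call RandomFlip, TorchPermute, or TorchNormalize, and the array sizes are made up.

```python
import numpy as np

# Synthetic image with H=2, W=3, C=4 in channel-last (HWC) layout.
hwc = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Flipping axis 1 of an HWC array mirrors the width, i.e. a horizontal flip.
flipped_hwc = np.flip(hwc, axis=1)

# Permute to channel-first (CHW) only after the flip.
chw = np.transpose(flipped_hwc, (2, 0, 1))
print(chw.shape)

# For comparison: the same axis-1 flip applied to a CHW array mirrors the
# height instead, which is not the intended horizontal flip.
chw_first = np.transpose(hwc, (2, 0, 1))
wrong = np.flip(chw_first, axis=1)
print(np.array_equal(wrong, chw))
```

This is why the order matters: flip while the data is still channel last, then permute for the normalization step.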
Hacky way to avoid `KeyError: 'ann_info'` by setting `results["ann_info"]["seg_map"]` to `results["img_info"]["ann"]["seg_map"]`. Also edited docstring of the LoadGeospatialAnnotations class slightly. Cherry-picked from NASA-IMPACT/hls-foundation@e5fb7ab.
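A minimal sketch of that workaround on a plain dictionary, assuming the `results` layout named above. The function name `copy_seg_map_to_ann_info` and the example file names are hypothetical and only illustrate the key copy; the actual change lives in the annotation loading class.

```python
def copy_seg_map_to_ann_info(results):
    """Copy the segmentation map path to where downstream code looks for it.

    Hypothetical helper: creates results["ann_info"] if it is missing, then
    sets its "seg_map" entry from results["img_info"]["ann"]["seg_map"].
    """
    results.setdefault("ann_info", {})["seg_map"] = (
        results["img_info"]["ann"]["seg_map"]
    )
    return results


# Made-up sample entry without an "ann_info" key.
sample = {"img_info": {"filename": "scene.tif", "ann": {"seg_map": "scene.mask.tif"}}}
copy_seg_map_to_ann_info(sample)
print(sample["ann_info"]["seg_map"])
```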
Making sure that the test_pipeline is consistent with the training and validation pipeline.
Getting a `TypeError` when the evaluation hook runs during training:
2023-08-03 17:03:11,629 - mmseg - INFO - workflow: [('train', 1)], max: 5000 iters
2023-08-03 17:03:11,629 - mmseg - INFO - Checkpoints will be saved to finetune_weights/irrigation_scenes/test_1/test_1 by HardDiskBackend.
2023-08-03 17:03:22,052 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
2023-08-03 17:03:58,935 - mmseg - INFO - Iter [20/5000] lr: 1.893e-07, eta: 3:15:17, time: 2.353, data_time: 0.046, memory: 6031, decode.loss_ce: 3.3195, decode.acc_seg: 6.3938, aux.loss_ce: 3.4039, aux.acc_seg: 1.7806, loss: 6.7234
[ ] 0/281, elapsed: 0s, ETA:Traceback (most recent call last):
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/.mim/tools/train.py", line 242, in <module>
main()
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/.mim/tools/train.py", line 231, in main
train_segmentor(
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/apis/train.py", line 194, in train_segmentor
runner.run(data_loaders, cfg.workflow)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 134, in run
iter_runner(iter_loaders[i], **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 67, in train
self.call_hook('after_train_iter')
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
getattr(hook, fn_name)(self)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/hooks/evaluation.py", line 262, in after_train_iter
self._do_evaluate(runner)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/core/evaluation/eval_hooks.py", line 117, in _do_evaluate
results = multi_gpu_test(
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/apis/test.py", line 208, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/fp16_utils.py", line 110, in new_func
return old_func(*args, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/models/segmentors/base.py", line 110, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/models/segmentors/base.py", line 74, in forward_test
raise TypeError(f'{name} must be a list, but got '
TypeError: imgs must be a list, but got <class 'torch.Tensor'>
This is the same error reported before at https://github.com/NASA-IMPACT/hls-foundation/pull/30#issuecomment-1603652525, which was fixed there with some hacky workarounds that modified the default collate function in mmsegmentation's code. It doesn't look possible to apply the same old workaround here anymore, so a different solution is needed. Xref upstream issue at open-mmlab/mmsegmentation#2410
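The traceback shows that `forward_test` rejects `imgs` unless it is a list. The sketch below only illustrates that contract with plain Python objects; it is not a confirmed fix for this PR, and the helper name `as_test_inputs` is hypothetical. Whether the wrapping belongs in the test pipeline, the collate function, or somewhere else is left open.

```python
def as_test_inputs(img, img_metas):
    """Wrap single test inputs in lists, leaving existing lists untouched.

    Hypothetical helper illustrating the list-typed inputs that the
    traceback above shows forward_test checking for.
    """
    imgs = img if isinstance(img, list) else [img]
    metas = img_metas if isinstance(img_metas, list) else [img_metas]
    if len(imgs) != len(metas):
        raise ValueError(
            f"Got {len(imgs)} image entries but {len(metas)} metadata entries"
        )
    return imgs, metas


# Placeholder objects stand in for a batched tensor and its metadata.
imgs, metas = as_test_inputs("batched_tensor", {"flip": False})
print(type(imgs).__name__, len(imgs), len(metas))
```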
An mmsegmentation configuration file for the irrigation_scenes dataset on https://huggingface.co/datasets/ibm-nasa-geospatial/hls_irrigation_scenes.
As this is a time-series dataset with data from four months stored in four different folders, a custom SpatioTemporalDataset class (subclassed from GeospatialDataset) and a LoadSpatioTemporalImagesFromFile class (subclassed from LoadGeospatialImageFromFile) were created to perform the data loading. Training uses only the first 3 months (June, July, August) for now. Also updated the fine-tuning-examples/README.md to mention how to run the irrigation_scenes setup.
Xref original work at https://github.com/NASA-IMPACT/hls-foundation/pull/30 and https://github.com/NASA-IMPACT/hls-foundation/pull/35
P.S. This is the same branch as #4, but that one got closed somehow during the private->public conversion of the repo.