
Weight Tensor Dimension Issue when training on SemanticKitti #567

Open
3 tasks done
SvenMala opened this issue Nov 11, 2022 · 1 comment · Fixed by #601
Labels: bug (Something isn't working)

Comments

@SvenMala

Checklist

Describe the issue

I have been trying to train KPConv on SemanticKITTI using the PyTorch pipeline.

I use the default /home/user/Open3D-ML-master/ml3d/configs/kpconv_semantickitti.yml config file, with "pin_memory: False" added to the pipeline section.
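
For reference, a minimal sketch of the equivalent override in Python, applied to the loaded config instead of the YAML file (assuming the loaded config sections can be edited like plain dicts, as in the Open3D-ML examples):

import open3d.ml as _ml3d

cfg = _ml3d.utils.Config.load_from_file(
    "/home/user/Open3D-ML-master/ml3d/configs/kpconv_semantickitti.yml")

# Equivalent to adding "pin_memory: False" under the pipeline section of the
# YAML config: override the value before the pipeline is constructed.
cfg.pipeline['pin_memory'] = False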

The dataset was downloaded using the /home/user/Open3D-ML-master/scripts/download_datasets/download_semantickitti.sh script.

After successfully running through the preprocessing, I keep getting a runtime error that seemingly comes from an unexpected weight tensor dimension:

RuntimeError: weight tensor should be defined either for all 19 classes or no classes but got weight tensor of shape: [1, 19]

I get the same error using the RandLANet model.

Any advice on how to deal with this issue would be appreciated.

Steps to reproduce the bug

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

dataset = ml3d.datasets.SemanticKITTI(dataset_path='/Datasets/SemanticKitti', use_cache=True)


cfg_file = "/Open3D-ML-master/ml3d/configs/kpconv_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)


# create the model with random initialization.
model = ml3d.models.KPFCNN(**cfg.model)

pipeline = ml3d.pipelines.SemanticSegmentation(model=model, dataset=dataset,
                                               num_workers=1, device="cpu",
                                               **cfg.pipeline)

# prints training progress in the console.
pipeline.run_train()

Error message

File "/home/user/anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
exec(code, globals, locals)

File "/home/user//Desktop/run_the_training.py", line 21, in
pipeline.run_train()

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 411, in run_train
loss, gt_labels, predict_scores = model.get_loss(

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/open3d/_ml3d/torch/models/kpconv.py", line 339, in get_loss
self.output_loss = Loss.weighted_CrossEntropyLoss(scores, labels)

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/torch/nn/modules/loss.py", line 1164, in forward
return F.cross_entropy(input, target, weight=self.weight,

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/torch/nn/functional.py", line 3014, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

RuntimeError: weight tensor should be defined either for all 19 classes or no classes but got weight tensor of shape: [1, 19]

Expected behavior

I was expecting the model to train on the SemanticKITTI dataset, but I keep getting the error above.

Open3D, Python and System information

- Operating system: Ubuntu 20.04
- Python version: 3.10.6 
- Open3D version: 0.16.0
- Is this a remote workstation?: no
- How did you install Open3D?: pip

Additional information

No response

@SvenMala added the bug label on Nov 11, 2022
@ashishrana160796 commented Nov 24, 2022

Hello @SvenMala, I faced a similar issue while training RandLA-Net on my own custom dataset.
I think the root cause of this issue is the shape of the class_weights parameter that is used for training these models.
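
To illustrate, here is a standalone sketch (plain PyTorch, not Open3D-ML code): CrossEntropyLoss expects a 1-D weight tensor with one entry per class, so a weight of shape [1, 19] reproduces exactly the error above, while a flattened [19] weight works.

import torch
import torch.nn as nn

scores = torch.randn(8, 19)           # dummy logits for 19 classes
labels = torch.randint(0, 19, (8,))   # dummy ground-truth labels

bad_weights = torch.ones(1, 19)       # shape [1, 19] -> RuntimeError at forward
good_weights = bad_weights.flatten()  # shape [19]    -> works

try:
    nn.CrossEntropyLoss(weight=bad_weights)(scores, labels)
except RuntimeError as err:
    print(err)  # "weight tensor should be defined either for all 19 classes or no classes ..."

print(nn.CrossEntropyLoss(weight=good_weights)(scores, labels))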

My first quick fix is to drop the class weight vector entirely; for my initial prototyping it was not of much use, and training worked for me (there is a performance loss for highly imbalanced datasets).
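
For the original SemanticKITTI setup, a minimal sketch of that first workaround (the class_weights constructor override is an assumption; check where your dataset/config actually defines class_weights):

import open3d.ml.torch as ml3d

# Hypothetical override: an empty class_weights list should make the pipeline
# fall back to an unweighted CrossEntropyLoss instead of the weighted one.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/Datasets/SemanticKitti',
                                      use_cache=True,
                                      class_weights=[])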

The second quick fix is to update the CrossEntropyLoss in /usr/local/lib/python3.8/dist-packages/open3d/_ml3d/torch/modules/losses/semseg_loss.py after setting class_weights to an empty list in the config file. The last lines should look something like the code snippet below. Basically, you hard-code the class weights into the regular CrossEntropyLoss function, making its functionality equivalent to the weighted one.

...
...
        else:
            # Hard-coded per-class weights (example values for a 6-class dataset);
            # passing them makes the plain CrossEntropyLoss behave like the weighted one.
            weights = torch.tensor([1.0, 4.0, 10.0, 10.0, 5.0, 1.5],
                                   dtype=torch.float, device=device)
            self.weighted_CrossEntropyLoss = nn.CrossEntropyLoss(
                weight=weights, label_smoothing=0.14)
