
Weight Tensor Dimension Issue when training on SemanticKitti #567

Open
3 tasks done
SvenMala opened this issue Nov 11, 2022 · 1 comment · Fixed by #601
Labels: bug (Something isn't working)

Comments

@SvenMala

Checklist

Describe the issue

I have been trying to train KPConv on SemanticKITTI using the PyTorch pipeline.

I use the default /home/user/Open3D-ML-master/ml3d/configs/kpconv_semantickitti.yml config file, with "pin_memory: False" added to the pipeline section.
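
For reference, a minimal sketch of the equivalent override in Python, applied to the loaded config instead of the YAML file (assuming the loaded config sections can be edited like plain dicts, as in the Open3D-ML examples):

import open3d.ml as _ml3d

cfg = _ml3d.utils.Config.load_from_file(
    "/home/user/Open3D-ML-master/ml3d/configs/kpconv_semantickitti.yml")

# Equivalent to adding "pin_memory: False" under the pipeline section of the
# YAML config: override the value before the pipeline is constructed.
cfg.pipeline['pin_memory'] = False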

The dataset was downloaded using the /home/user/Open3D-ML-master/scripts/download_datasets/download_semantickitti.sh script.

After successfully running through the preprocessing, I keep getting a runtime error that seemingly comes from an unexpected weight tensor dimension:

RuntimeError: weight tensor should be defined either for all 19 classes or no classes but got weight tensor of shape: [1, 19]

I get the same error using the RandLANet model.

Any advice on how to deal with this issue would be appreciated.

Steps to reproduce the bug

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

dataset = ml3d.datasets.SemanticKITTI(dataset_path='/Datasets/SemanticKitti', use_cache=True)


cfg_file = "/Open3D-ML-master/ml3d/configs/kpconv_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)


# create the model with random initialization.
model = ml3d.models.KPFCNN(**cfg.model)

pipeline = ml3d.pipelines.SemanticSegmentation(model=model, dataset=dataset,
                                               num_workers=1, device="cpu",
                                               **cfg.pipeline)

# prints training progress in the console.
pipeline.run_train()

Error message

File "/home/user/anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
exec(code, globals, locals)

File "/home/user//Desktop/run_the_training.py", line 21, in
pipeline.run_train()

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 411, in run_train
loss, gt_labels, predict_scores = model.get_loss(

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/open3d/_ml3d/torch/models/kpconv.py", line 339, in get_loss
self.output_loss = Loss.weighted_CrossEntropyLoss(scores, labels)

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/torch/nn/modules/loss.py", line 1164, in forward
return F.cross_entropy(input, target, weight=self.weight,

File "/home/user//anaconda3/envs/pointcloud_pytorch/lib/python3.10/site-packages/torch/nn/functional.py", line 3014, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

RuntimeError: weight tensor should be defined either for all 19 classes or no classes but got weight tensor of shape: [1, 19]

Expected behavior

I was expecting the model to train on the SemanticKITTI dataset, but I keep getting the error above.

Open3D, Python and System information

- Operating system: Ubuntu 20.04
- Python version: 3.10.6 
- Open3D version: 0.16.0
- Is this a remote workstation?: no
- How did you install Open3D?: pip

Additional information

No response

@SvenMala added the bug label on Nov 11, 2022
@ashishrana160796 commented Nov 24, 2022

Hello @SvenMala, I faced a similar issue while training RandLA-Net on my own custom dataset.
I think the root cause of this issue is the shape of the class_weights parameter that is used for training these models.
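
To illustrate, here is a standalone sketch (plain PyTorch, not Open3D-ML code): CrossEntropyLoss expects a 1-D weight tensor with one entry per class, so a weight of shape [1, 19] reproduces exactly the error above, while a flattened [19] weight works.

import torch
import torch.nn as nn

scores = torch.randn(8, 19)           # dummy logits for 19 classes
labels = torch.randint(0, 19, (8,))   # dummy ground-truth labels

bad_weights = torch.ones(1, 19)       # shape [1, 19] -> RuntimeError at forward
good_weights = bad_weights.flatten()  # shape [19]    -> works

try:
    nn.CrossEntropyLoss(weight=bad_weights)(scores, labels)
except RuntimeError as err:
    print(err)  # "weight tensor should be defined either for all 19 classes or no classes ..."

print(nn.CrossEntropyLoss(weight=good_weights)(scores, labels))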

My first quick fix is to drop the class weight vector entirely; for my initial prototyping it was not of much use, and training worked for me (there is a performance loss for highly imbalanced datasets).
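
For the original SemanticKITTI setup, a minimal sketch of that first workaround (the class_weights constructor override is an assumption; check where your dataset/config actually defines class_weights):

import open3d.ml.torch as ml3d

# Hypothetical override: an empty class_weights list should make the pipeline
# fall back to an unweighted CrossEntropyLoss instead of the weighted one.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/Datasets/SemanticKitti',
                                      use_cache=True,
                                      class_weights=[])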

The second quick fix is to update the CrossEntropyLoss in /usr/local/lib/python3.8/dist-packages/open3d/_ml3d/torch/modules/losses/semseg_loss.py after setting class_weights to an empty list in the config file. The last lines should look something like the code snippet below. Basically, you hard-code the class weights into the regular CrossEntropyLoss function, making its functionality equivalent to the weighted one.

...
...
        else:
            # Hard-coded per-class weights (example values for a 6-class dataset);
            # passing them makes the plain CrossEntropyLoss behave like the weighted one.
            weights = torch.tensor([1.0, 4.0, 10.0, 10.0, 5.0, 1.5],
                                   dtype=torch.float, device=device)
            self.weighted_CrossEntropyLoss = nn.CrossEntropyLoss(
                weight=weights, label_smoothing=0.14)
