
kitti-pretrain loss and acc problem #8

Open · XuekuanWang opened this issue Nov 10, 2022 · 8 comments

No description provided.

XuekuanWang (Author) commented Nov 10, 2022

Hello,
We have trained the SimIPU KITTI model and found that the intro loss is lower than the cross loss, but the intro top-1 accuracy is also lower.

2022-11-10 14:44:13,589 - mmdet - INFO - Epoch [122][30/58] lr: 2.964e-04, eta: 1:24:15, time: 2.173, data_time: 0.245, memory: 59185, cross_acc_top1: 71.7124, cross_acc_top5: 93.2814, cross_loss: 6.9118, intro_loss: 2.8096, intro_acc_top1: 36.9674, intro_acc_top5: 73.1265, loss: 9.7215
2022-11-10 14:46:14,595 - mmdet - INFO - Epoch [123][30/58] lr: 2.934e-04, eta: 1:23:11, time: 2.193, data_time: 0.189, memory: 59185, cross_acc_top1: 71.1550, cross_acc_top5: 92.9795, cross_loss: 6.9552, intro_loss: 2.8046, intro_acc_top1: 36.8098, intro_acc_top5: 72.9778, loss: 9.7598

I would expect the intro branch to be easier to learn, so its accuracy should be higher. Is this result correct?

XuekuanWang changed the title intro → kitti-pretrain loss and acc problem on Nov 10, 2022
zhyever (Owner) commented Nov 10, 2022

Maybe not. Remember that we need to adopt a matching algorithm to get positive pairs in the intro branch, and there can be mistakes in the matching. Also, the positive pairs do not sit at exactly the same spatial position in 3D space. These things also make intro-learning more difficult.

However, even with all these problems, the intro-branch features live in the same representation space (i.e., both are extracted by PointNet in this work), so their similarity is higher than that between image features and LiDAR features (cross-branch). Hence, the loss can be small (features are more similar) while the accuracy is lower (positives are hard to distinguish from negatives).

These are intuitive arguments. To go a step further, I guess one would need to study exactly when the contrastive loss becomes lower.
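For reference, here is a minimal sketch of how an InfoNCE-style loss and top-1 accuracy are typically computed. This is a generic illustration, not necessarily the exact implementation in this repo; it only shows that the reported loss and accuracy come from the same similarity logits but measure different things.

```python
import torch
import torch.nn.functional as F

def info_nce_with_acc(query, key, temperature=0.07):
    """Generic InfoNCE loss with top-1 accuracy; positives are matched by row index."""
    q = F.normalize(query, dim=1)
    k = F.normalize(key, dim=1)
    logits = q @ k.t() / temperature                       # (N, N) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)      # i-th query matches i-th key
    loss = F.cross_entropy(logits, labels)                 # low when positives score relatively high
    acc1 = (logits.argmax(dim=1) == labels).float().mean() # strict: positive must rank first
    return loss, acc1

# toy usage with random features
q, k = torch.randn(8, 128), torch.randn(8, 128)
loss, acc1 = info_nce_with_acc(q, k)
```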

XuekuanWang (Author) commented Nov 10, 2022

Thanks.

I also tried to reproduce the experimental results of the paper, but failed.

3D-AP

| Setting | Easy | Mod | Hard |
| -- | -- | -- | -- |
| paper | 81.32% | 70.88% | 66.19% |
| paper, w/o pretrain | 79.17% | 68.58% | 64.81% |
| 100 epochs (pretrain) + 100 epochs (downstream task) | 79.49% | 68.54% | 64.23% |
| w/o pretrain, 40 epochs | 77.59% | 67.57% | 61.78% |
| w/o pretrain, 100 epochs | 78.60% | 68.75% | 64.66% |

What could be the reason?

1) Is my kitti-pretrain model correct? Does this log look right?

2022-11-09 03:34:19,568 - mmdet - INFO - Epoch [100][90/116] lr: 3.697e-04, eta: 0:00:20, time: 0.999, data_time: 0.033, memory: 29223, cross_acc_top1: 75.8513, cross_acc_top5: 95.9727, cross_loss: 6.0896, intro_loss: 2.7013, intro_acc_top1: 37.7396, intro_acc_top5: 73.4069, loss: 8.7908
2022-11-09 03:34:45,791 - mmdet - INFO - Saving checkpoint at 100 epochs

2) I also tried to use the released kitti-pretrain model, but it fails to load: many keys do not match.

Can you help me reproduce the experimental results of the paper? Thanks.

zhyever (Owner) commented Nov 10, 2022

Please refer to the 3D detection log presented here. The model performance in the last several epochs is consistently better than the baseline w/o pre-training. No other tricks were adopted in our experiments, and the log corresponds exactly to the experiment reported in our paper.

For the pre-trained model, please let me know which keys are missing. I am not sure whether I uploaded the wrong models.

XuekuanWang (Author) commented Nov 10, 2022

OK, thanks.
This is my log when loading the pre-trained model SimIPU_kitti_50e.pth.
There is a warning: "unexpected key in source state_dict".

'please set runner in your config.', UserWarning)
2022-11-10 17:54:12,058 - mmdet - INFO - load checkpoint from local path: /root/paddlejob/workspace/env_run/kuan/exp/simipu/SimIPU_kitti_50e.pth
2022-11-10 17:54:12,135 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.bn1.num_batches_tracked, backbone.layer1.0.conv1.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.bn1.num_batches_tracked, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.bn2.num_batches_tracked, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.bn3.num_batches_tracked, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.0.downsample.1.num_batches_tracked, backbone.layer1.1.conv1.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, backbone.layer1.1.bn1.num_batches_tracked, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.bn2.num_batches_tracked, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.1.bn3.num_batches_tracked, backbone.layer1.2.conv1.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.bn1.num_batches_tracked, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, backbone.layer1.2.bn2.running_mean, backbone.layer1.2.bn2.running_var, backbone.layer1.2.bn2.num_batches_tracked, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer1.2.bn3.num_batches_tracked, backbone.layer2.0.conv1.weight, backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.bn1.num_batches_tracked, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.bn2.num_batches_tracked, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.bn3.num_batches_tracked, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.0.downsample.1.num_batches_tracked, backbone.layer2.1.conv1.weight, backbone.layer2.1.bn1.weight, backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.bn1.num_batches_tracked, backbone.layer2.1.conv2.weight, backbone.layer2.1.bn2.weight, backbone.l
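One quick way to see what is in the checkpoint is to list the top-level key prefixes and compare them with the names the detector expects (e.g. `img_backbone`). A minimal sketch, assuming a standard mmcv-style checkpoint where the weights are nested under `state_dict`:

```python
import torch
from collections import Counter

ckpt = torch.load("SimIPU_kitti_50e.pth", map_location="cpu")
sd = ckpt.get("state_dict", ckpt)  # mmcv checkpoints usually nest weights under "state_dict"

# Count the top-level prefixes, e.g. "backbone", "img_backbone", ...
prefixes = Counter(key.split(".")[0] for key in sd.keys())
for name, count in prefixes.most_common():
    print(f"{name}: {count} tensors")
```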

zhyever (Owner) commented Nov 10, 2022

Could you please try other SimIPU pre-trained models, so that I can check whether I uploaded the wrong model?

XuekuanWang (Author) commented

OK, I will try another pre-trained model.
Below are the parameters of moca_r50_kitti. The keys that do not match are "backbone.conv1.weight, backbone.bn1.weight, ...", right?

```
(img_backbone): ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): ResLayer(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
```

zhyever (Owner) commented Nov 29, 2022

:D Hi, I would like to ask if there is any update.

Remember that during pre-training there are actually two encoders (image and point cloud). So when loading the parameters into the downstream model, the point-cloud part can be mismatched while the image-encoder parameters are loaded successfully.

If there is a bug, you could rename the keys in the parameter dict provided in this repo. For example, you may change `backbone` to `img_backbone`, as in the sketch below.
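A minimal sketch of such a rename, assuming an mmcv-style checkpoint with the weights nested under `state_dict` (the output filename is just a placeholder):

```python
import torch

ckpt = torch.load("SimIPU_kitti_50e.pth", map_location="cpu")
sd = ckpt.get("state_dict", ckpt)  # mmcv checkpoints usually nest weights under "state_dict"

# Rename the pre-training image-encoder keys ("backbone.*") to the detector's
# naming ("img_backbone.*"); leave all other keys (e.g. the point-cloud branch)
# unchanged, so the loader will simply warn about any that remain unused.
new_sd = {}
for key, value in sd.items():
    if key.startswith("backbone."):
        new_sd["img_backbone." + key[len("backbone."):]] = value
    else:
        new_sd[key] = value

torch.save({"state_dict": new_sd}, "SimIPU_kitti_50e_img_backbone.pth")
```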
