Question related to Ablation study & CSS Net five layers freeze #7

Open
taeyeopl opened this issue Oct 14, 2021 · 6 comments

Comments

@taeyeopl

taeyeopl commented Oct 14, 2021

Thanks for sharing the great work!
I have two simple questions related to the ablation study & CSS Net freeze part.

Q1. Can you explain the difference between (R,t) / (R,t),s / (R,t),s,z in Table 3 of the main paper?

image

Due to my limited understanding, it is hard for me to see the difference clearly, including in the implementation. Is it for making a label, or for variables in DeepSDF training? I'm curious because I can't find where all of [(R,t), s, and z] take effect in your code.

def get_kitti_label(dsdf, grid, latent, scale, trans, yaw, p_WC, bbox):

Q2. In the code, conv1, bn1, and layer1 are frozen. Can you explain how the number of layers (5) is counted?
I saw in supplementary C.1 (CSS Net) that "the first five layers are frozen in order to prevent overfitting
to peculiarities of the rendered data".

_freeze_module(self.conv1)

@taeyeopl taeyeopl changed the title Question related to CSS Net five layers freeze Question related to Ablation study & CSS Net five layers freeze Oct 14, 2021
@xmyqsh

xmyqsh commented Oct 19, 2021

A1: R (rotation), t (translation), s (scale), z (shape latent code, 3-dim in this paper).
R, t, and s can be estimated by 3D-3D correspondence estimation. One set of 3D points is the back-projected LiDAR frustum points from NOCS. The other set is points from the DeepSDF-rendered model (which is normalized and centered, just like sampling on a CAD model).
Because of the 1-to-1 correspondence between NOCS (a 2D map) and the DeepSDF model points, we can sample correspondence pairs, and then the 3D-3D estimation can be solved with the Kabsch or Procrustes algorithm.

z is the latent vector conditioning DeepSDF, which can be computed for each SDF shape model by MAP estimation via auto-decoding in DeepSDF. The MAP auto-decoding process is expensive, so the resulting z is saved as a CSS label. z can then be predicted by css_net and used as the conditioning input for DeepSDF.
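As an illustration (not this repo's code), recovering the similarity transform (s, R, t) from such sampled correspondence pairs can be done with the Umeyama variant of the Kabsch algorithm; a minimal NumPy sketch:

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Estimate s, R, t such that dst ≈ s * R @ src + t.
    src, dst: (N, 3) arrays of corresponding 3D points."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    var_s = (src_c ** 2).sum() / len(src)      # variance of source points
    cov = dst_c.T @ src_c / len(src)           # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                           # avoid reflections
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t
```

With noise-free correspondences this recovers the transform exactly; in practice it is usually wrapped in RANSAC to handle outlier pairs.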

def get_kitti_label(dsdf, grid, latent, scale, trans, yaw, p_WC, bbox):
The latent is generated by the MAP procedure described above, but the related code is not in this repo; it should be in the author's other repo.
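For intuition only, here is a minimal sketch of MAP auto-decoding of a latent code, assuming a hypothetical `decoder(z, xyz)` callable; this is not the author's actual code, and DeepSDF itself uses a clamped L1 data term:

```python
import torch

def map_latent(decoder, points, sdf_gt, latent_dim=3, iters=500, reg=1e-4):
    """MAP-style auto-decoding of a latent code for one shape (sketch).
    decoder(z, xyz) -> predicted SDF values; points: (N, 3); sdf_gt: (N, 1)."""
    z = torch.zeros(latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=1e-2)
    for _ in range(iters):
        opt.zero_grad()
        pred = decoder(z.expand(points.shape[0], -1), points)
        # data term + zero-mean Gaussian prior on z (the prior makes it MAP)
        loss = torch.nn.functional.mse_loss(pred, sdf_gt) + reg * (z ** 2).sum()
        loss.backward()
        opt.step()
    return z.detach()
```

Running hundreds of optimizer steps per shape is why the result is cached as a CSS label rather than recomputed, and why css_net predicting z in one forward pass is attractive.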

A2: "the first five layers" means the first five conv layers: 1 (self.conv1) + 4 (self.layer1: in ResNet-18, layer1 has 2 BasicBlocks with 2 convs each) = 5.

There is a bug in the freeze code; see PR #8 for details.
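For reference, a minimal sketch of that counting, using a hypothetical stand-in `basic_block` instead of torchvision's ResNet (not the repo's code):

```python
import torch.nn as nn

def _freeze_module(module):
    # stop gradient updates for every parameter in the module
    for p in module.parameters():
        p.requires_grad = False

def basic_block(ch):
    # minimal stand-in for a ResNet-18 BasicBlock: two 3x3 convs
    return nn.Sequential(
        nn.Conv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch), nn.ReLU(),
        nn.Conv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch),
    )

conv1 = nn.Conv2d(3, 64, 7, stride=2, padding=3, bias=False)
bn1 = nn.BatchNorm2d(64)
layer1 = nn.Sequential(basic_block(64), basic_block(64))  # 2 blocks x 2 convs

for m in (conv1, bn1, layer1):
    _freeze_module(m)

# conv1 contributes 1 conv layer, layer1 contributes 4: 1 + 4 = 5
n_frozen_convs = 1 + sum(isinstance(m, nn.Conv2d) for m in layer1.modules())
```

bn1 is frozen alongside conv1 but, being a BatchNorm layer, is not counted among the five conv layers.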

@taeyeopl
Author

Thanks for the explanation!
I understood each component but still have some misunderstandings.
Sorry for my poor understanding.

Q1. Can you explain each experimental setting clearly?
It would really help me understand the ablation.
As I understand it, based on the equation:
image

  1. [setting 1] (R,t) means s is not multiplied in; only R and t are used to transform the DeepSDF rendering points into the LiDAR coordinate frame.
  2. [setting 2] (R,t,s) means the same as equation (10).
  3. [setting 3] (R,t,s,z) -> the z part is quite hard to understand, because it seems like a must-have for optimization. Can you explain the difference without/with z?
    image

@xmyqsh

xmyqsh commented Oct 19, 2021

Setting 3 is the default setting of this repo; settings 1 and 2 are not currently supported.
The purpose of the conditioning latent code z is that one DeepSDF can cover all model shapes, instead of needing one DeepSDF per model shape.
I cannot picture what settings 1 and 2 look like without more detail from the paper or code.
We need the author to explain settings 1 and 2.
@zakharos

@taeyeopl
Author

taeyeopl commented Oct 22, 2021

I think it is not desirable to compare [one z covering all model shapes (single class, car)] against [one DeepSDF per model shape] as an ablation study, because the original DeepSDF already covers all models of a single class (car). It could make sense if a single model covered multiple classes (car, bike, etc.). Moreover, a driving scenario can't adopt one DeepSDF per model shape; it would be challenging to build a model for every car.

Nevertheless, I would appreciate it if you could explain each setting in order to have a clear understanding of the ablation study.
@zakharos

@zakharos
Collaborator

zakharos commented Nov 5, 2021

Hi @taeyeop-lee! I apologize for the delay! Please find the answers to your questions below:

Ablation setup
The goal of the ablation is to demonstrate how different components of the pipeline affect the final downstream performance (detection). In particular, the 3 settings you are referring to demonstrate how different optimization variables - R (rotation), t (translation), s (scale), and z (latent shape code) - affect the end performance.

  • [setting 1] (R,t): in this setting we only optimize over R and t, and use the initial scale prediction s without optimizing it.
  • [setting 2] (R,t,s): here we optimize over rotation, translation, and scale.
  • [setting 3] (R,t,s,z): finally, in this case we additionally optimize over the z variable, which makes it possible to also change the shape of the model during optimization.

From the results in Table 3, we see that setting 3 gives the best overall performance.
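The three settings can be pictured as choosing which variables receive gradients during the optimization; a hypothetical helper (not the repo's code) sketching this:

```python
import torch

def make_opt_variables(R0, t0, s0, z0, setting):
    """Pick which variables are optimized per ablation setting:
    1 -> (R, t); 2 -> (R, t, s); 3 -> (R, t, s, z).
    R0, t0, s0, z0 are the initial predictions."""
    R = R0.clone().requires_grad_(True)
    t = t0.clone().requires_grad_(True)
    s = s0.clone().requires_grad_(setting >= 2)  # setting 1 keeps the initial scale fixed
    z = z0.clone().requires_grad_(setting >= 3)  # shape code only optimized in setting 3
    opt = torch.optim.Adam([v for v in (R, t, s, z) if v.requires_grad], lr=1e-2)
    return (R, t, s, z), opt
```

Frozen variables still enter the objective with their initial values; they just never move during the refinement.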

Frozen layers
@xmyqsh is absolutely right in describing how frozen layers are computed.

@xmyqsh

xmyqsh commented Nov 15, 2021

@zakharos
Actually, my doubt is about [setting 1]: what is the value of the initial scale s, and what is its range, [0, inf)? I don't think it can achieve such a good result without optimizing s.
@taeyeop-lee Good suggestion. A further experiment is needed to verify it.
