
Reproducing benchmark for CARLA #66

Open
lippoldt opened this issue Jan 18, 2024 · 1 comment

@lippoldt

I am currently testing the capabilities of NKSR and, for that purpose, have been running some evaluation tests on the reported scores.
I downloaded the CARLA data you provided and ran the test script together with the included metrics.
The metrics for the pretrained CARLA backbone are as follows:

completeness (68) 0.027692637713934216
accuracy (68) 0.03867699400040561
normals completeness (68) 0.965071364518667
normals accuracy (68) 0.9516956805795869
normals (68) 0.9583835225491268
completeness2 (68) 0.025281473814217505
accuracy2 (68) 0.007717492560607829
chamfer-L2 (68) 0.016499483187412674
chamfer-L1 (68) 0.03318481585716991
f-precision (68) 0.2145197705882353
f-recall (68) 0.28277397766826823
f-score (68) 0.24255425021785418
f-score-15 (68) 0.4549117034594062
f-score-20 (68) 0.6125048929714518

According to the paper, the F-score should be above 0.9.

I have also been testing the training procedure on the CARLA data, and while validation accuracies look very promising (also above 90%), the test F-scores are again low.

I also tried switching the precision from 32 to 64, but this did not bring any significant improvement.

How can I reproduce the numbers from the paper?
[Screenshot attached: Screenshot (1405)]

@heiwang1997 (Collaborator) commented Feb 7, 2024

Hi @lippoldt, thank you for reaching out. For Table 3 we use a different threshold for computing the F-Score. This is clarified in the appendix of our paper, as shown below:

[screenshot of the relevant appendix paragraph]
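
For reference, point-based F-Scores of this kind are generally computed as the harmonic mean of precision and recall under a distance threshold, so changing the threshold changes the reported number substantially. The sketch below is only a minimal illustration of that computation under standard assumptions; it is not the exact NKSR evaluation code, and the threshold `tau` is a placeholder, not the value used in the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def f_score(pred_pts, gt_pts, tau):
    """Minimal sketch of a distance-threshold F-Score between two point sets.

    pred_pts, gt_pts: (N, 3) arrays sampled from the predicted / ground-truth surfaces.
    tau: distance threshold; a larger (outdoor-scale) threshold yields higher scores
         than a small indoor-scale one, which is the discrepancy discussed here.
    """
    # precision: fraction of predicted points within tau of the ground truth
    d_pred_to_gt, _ = cKDTree(gt_pts).query(pred_pts)
    precision = (d_pred_to_gt < tau).mean()
    # recall: fraction of ground-truth points within tau of the prediction
    d_gt_to_pred, _ = cKDTree(pred_pts).query(gt_pts)
    recall = (d_gt_to_pred < tau).mean()
    # harmonic mean, with a small epsilon to avoid division by zero
    return 2 * precision * recall / (precision + recall + 1e-8)
```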

To evaluate with this F-Score, would you please change metric_names=MeshEvaluator.ESSENTIAL_METRICS in the following lines:

NKSR/models/nksr_net.py, lines 301 to 303 (commit 0d4e369):

evaluator = MeshEvaluator(
    n_points=int(5e6) if ref_geometry is not None else int(5e5),
    metric_names=MeshEvaluator.ESSENTIAL_METRICS)

into metric_names=['f-score-outdoor'] and try again? The reason we use a different score is that the scales of the datasets are essentially different.
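
For clarity, the modified call would look roughly like the sketch below, keeping the rest of the snippet unchanged:

```python
# NKSR/models/nksr_net.py, around lines 301-303 (commit 0d4e369)
evaluator = MeshEvaluator(
    n_points=int(5e6) if ref_geometry is not None else int(5e5),
    metric_names=['f-score-outdoor'])  # instead of MeshEvaluator.ESSENTIAL_METRICS
```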

Sorry for the delayed response, and I am happy to assist you with any further questions.
