Reproduce ScanNet200 Results #38

Open
Louis708 opened this issue Oct 27, 2024 · 4 comments


Louis708 commented Oct 27, 2024

Hi @PhucNDA,

I am trying to reproduce the ScanNet200 results. I prepared the data and followed your instructions for running the code, and I got the results below:

ScanNet200 Evaluation
################################################
what           :      AP  AP_50%  AP_25%
################################################
Head AP        :   0.254   0.314   0.342
Common AP      :   0.209   0.259   0.282
Tail AP        :   0.212   0.260   0.295
Base AP        :   0.246   0.308   0.342
Novel AP       :   0.218   0.267   0.294
------------------------------------------------
AP             :   0.226   0.279   0.307
################################################

My overall results (AP 0.226, AP_50% 0.279, AP_25% 0.307) are somewhat lower than the paper's (0.237, 0.294, 0.328). Is this gap within an acceptable range, or am I missing a step in the reproduction?
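To put a number on the gap, here is a small sketch that compares the two sets of overall scores quoted above. The dictionary keys and the `gap_report` helper are names made up for this illustration; only the six values come from the post.

```python
# Overall scores copied from the post above (AP, AP_50%, AP_25%).
reproduced = {"AP": 0.226, "AP_50": 0.279, "AP_25": 0.307}
paper = {"AP": 0.237, "AP_50": 0.294, "AP_25": 0.328}


def gap_report(ours, ref):
    """Return {metric: (signed_gap, signed_gap_percent)} relative to ref."""
    report = {}
    for name, ref_value in ref.items():
        gap = ours[name] - ref_value  # negative means we are below the reference
        report[name] = (round(gap, 3), round(100.0 * gap / ref_value, 1))
    return report


gaps = gap_report(reproduced, paper)
for name, (gap, rel) in gaps.items():
    print(f"{name}: {gap:+.3f} absolute, {rel:+.1f}% relative")
```

By this calculation the reproduced run is lower by 0.011, 0.015, and 0.021 points, which is roughly 4.6%, 5.1%, and 6.4% relative to the paper's numbers. Whether that counts as acceptable run-to-run variance is a question for the maintainers.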

Here are the models I use:

Grounded-SAM (groundingdino_swint_ogc, sam_vit_h_4b8939), CLIP (ViT-L/14@336px)

I only changed the file paths in the config and set the agnostic flag to False. Here is my config:

proposals:
  p2d: True # 2D branch
  p3d: True # 3D branch
  agnostic: False
  refined: True

Here is my understanding of how to run the code:

  1. grounding_2d.sh: Generate the 2D masks (maskGdino) and the first-stage features (grounded_feat). This step takes many hours to run.

  2. generate_3d_inst.sh: Generate 3D instances (hier_agglo) from 2D masks using hierarchical agglomerative clustering.

  3. refine_grounding_feat: Refine the second-stage features from the 3D instances (hier_agglo) and output the refined features (refined_grounded_feat).

  4. generate_3d_inst.sh: Finalize the 3D output masks from the refined features (refined_grounded_feat). In this step, I change the bool here to False and the bool here to True to get the final output (final_result_hier_agglo) instead of the 3D instances (hier_agglo).
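One way to sanity-check the four steps above is to confirm that each stage actually wrote its outputs before moving on. The sketch below is only an illustration: the folder names are the ones mentioned in the steps, but the assumption that they all sit under one root directory, the `./exp_scannet200` path, and the `check_stage_outputs` helper are hypothetical and not taken from the repository.

```python
from pathlib import Path

# Stage -> output folder names as described in the four steps above.
# Assumption: every folder lives directly under a single experiment root.
STAGES = [
    ("grounding_2d.sh", ["maskGdino", "grounded_feat"]),
    ("generate_3d_inst.sh (first pass)", ["hier_agglo"]),
    ("refine_grounding_feat", ["refined_grounded_feat"]),
    ("generate_3d_inst.sh (second pass)", ["final_result_hier_agglo"]),
]


def check_stage_outputs(root):
    """Return (stage, folder, file_count) rows; file_count is None if the folder is missing."""
    root = Path(root)
    rows = []
    for stage, folders in STAGES:
        for folder in folders:
            path = root / folder
            if path.is_dir():
                count = sum(1 for p in path.iterdir() if p.is_file())
            else:
                count = None
            rows.append((stage, folder, count))
    return rows


if __name__ == "__main__":
    # Hypothetical experiment root; replace with the path set in your config.
    for stage, folder, count in check_stage_outputs("./exp_scannet200"):
        status = "MISSING" if count is None else f"{count} files"
        print(f"{stage:35s} {folder:26s} {status}")
```

A missing folder, or a file count that differs between stages, would point to a stage that was skipped or that ran on a partial scene list.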

I understand that it's hard to diagnose the problem from this description alone. Your work is very cool, and I would really appreciate any help or insight you could give me.

Thanks.

PhucNDA (Collaborator) commented Oct 27, 2024

Hi @Louis708,

How did you generate the 3D features from ISBNet?

PhucNDA (Collaborator) commented Oct 27, 2024

Hi @Louis708,

You may want to independently verify the results for the 3D backbone-only and 2D-only cases to help identify the bug. These specific results are available on our webpage. If you encounter any issues with the source code, please don’t hesitate to reach out to me.
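As a rough sketch of that check, the three runs could be set up by toggling the two branch flags from the config posted above. This assumes that `p2d` and `p3d` switch the 2D and 3D branches on and off as their comments suggest; the `ablation_variants` helper and the variant names are made up for illustration, and whether `refined` should also change for these runs is not stated in this thread.

```python
import copy

# Proposal flags copied from the config in the original post.
base_config = {
    "proposals": {"p2d": True, "p3d": True, "agnostic": False, "refined": True}
}

# (p2d, p3d) settings for each run; the names are illustrative only.
BRANCH_SETTINGS = {
    "2d_only": (True, False),
    "3d_only": (False, True),
    "2d_and_3d": (True, True),
}


def ablation_variants(config):
    """Return one deep-copied config per branch setting, leaving the input untouched."""
    variants = {}
    for name, (p2d, p3d) in BRANCH_SETTINGS.items():
        variant = copy.deepcopy(config)
        variant["proposals"]["p2d"] = p2d
        variant["proposals"]["p3d"] = p3d
        variants[name] = variant
    return variants


variants = ablation_variants(base_config)
for name, cfg in variants.items():
    print(name, cfg["proposals"])
```

Evaluating each variant separately and comparing against the published per-branch numbers should show whether the gap comes from the 2D branch, the 3D backbone, or the combination step.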

Louis708 (Author) commented

Hi @PhucNDA,

Thanks for the very quick reply. I generated the 3D features from ISBNet using the following commands from https://github.com/VinAIResearch/Open3DIS/blob/main/docs/DATA.md#3d-backbone:

cd segmenter3d/ISBNet/
python3 tools/test.py configs/scannet200/isbnet_scannet200.yaml pretrains/scannet200/head_scannetv2_200_val.pth

I will check the 3D backbone-only and 2D-only results on ScanNet200 later.

Thanks

sgmzhou4 commented


Would you mind sharing how you modified eval.py, along with your GT file?
