Hi, thanks for sharing your work. I am confused about how you prepare the ground-truth reports for evaluation.
As shown in modules/datasets.py#L20, you truncate each report so that its length does not exceed max_seq_length. This is fine for training. However, during inference, as shown in modules/tester.py#L83, you take these truncated reports as the ground truths and use pycocoeval to calculate the scores.
In my opinion, the ground-truth reports should not be truncated, since truncation removes important details and leads to artificially high scores.
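To make the concern concrete, here is a minimal, self-contained sketch of the pattern I mean (the `tokenize` function and the variable names are illustrative, not the repo's exact code):

```python
# Illustrative sketch only -- `tokenize` and `max_seq_length` stand in for
# the repo's actual tokenizer and config value.

def tokenize(report: str) -> list[str]:
    return report.split()

max_seq_length = 8  # the repo uses a larger value; 8 keeps the example short

full_report = ("heart size is normal . lungs are clear . "
               "no pleural effusion or pneumothorax .")

# Truncation on the dataset side (reasonable for fixed-length training):
token_ids = tokenize(full_report)[:max_seq_length]

# The concern: the same truncated sequence is later treated as the ground
# truth at test time, so anything past max_seq_length can never be scored.
truncated_gt = " ".join(token_ids)
print(truncated_gt)  # -> heart size is normal . lungs are clear
```

Here the sentence about pleural effusion and pneumothorax is silently dropped from the reference, so a model that never predicts it pays no penalty.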
Am I getting this wrong? Looking forward to your reply.
When evaluating the same model with the same hyper-parameters (e.g., beam size = 3), we get different results whenever we alter max_seq_length (e.g., from 128 to 64), because changing it also changes the length of the ground-truth reports. This is not what we expect: the ground-truth reports should be consistent and untruncated.
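A quick way to see this sensitivity: a hedged sketch, assuming the pycocoevalcap package is installed and using made-up report text. Only the references change between the two runs; the model output is fixed.

```python
from pycocoevalcap.bleu.bleu import Bleu

full_ref = ("heart size is normal . lungs are clear . "
            "no pleural effusion or pneumothorax .")
hypothesis = "heart size is normal . lungs are clear"  # fixed model output

def truncate(text, max_seq_length):
    # clip the reference the same way the dataset code clips reports
    return " ".join(text.split()[:max_seq_length])

scorer = Bleu(4)
for max_seq_length in (64, 8):  # stands in for e.g. 128 vs 64 above
    gts = {0: [truncate(full_ref, max_seq_length)]}  # references
    res = {0: [hypothesis]}                          # candidate
    score, _ = scorer.compute_score(gts, res)
    print(max_seq_length, [round(s, 3) for s in score])

# With max_seq_length=8 the reference collapses to exactly the hypothesis,
# so BLEU-1..4 all hit 1.0 even though the prediction never changed.
```

The same prediction scores very differently under the two settings, which is why results reported with different max_seq_length values are not comparable.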