
Why do you truncate the ground-truth reports for evaluation? #10

Open · yangbang18 opened this issue May 18, 2024 · 2 comments

@yangbang18 commented May 18, 2024

Hi, thanks for sharing your work. I am confused about how you prepare the ground-truth reports for evaluation.

As shown in modules/datasets.py#L20, you truncate each report so that its length does not exceed max_seq_length. This is fine for training. However, during inference, as shown in modules/tester.py#L83, you take these truncated reports as the ground truths and use pycocoeval to calculate scores.

In my opinion, the ground-truth reports should not be truncated: truncation removes important details from the references and leads to artificially inflated scores.
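To make this concrete, here is a minimal, self-contained sketch (not the repo's actual code; it approximates the truncation in modules/datasets.py with plain whitespace tokenization and assumes pycocoevalcap is installed) showing how a truncated reference can yield a much higher BLEU than the full reference:

```python
# Minimal sketch: BLEU against the full reference vs. a reference truncated to
# max_seq_length tokens. Whitespace tokenization only approximates the repo's
# tokenizer; the numbers are illustrative only.
from pycocoevalcap.bleu.bleu import Bleu


def truncate(report: str, max_seq_length: int) -> str:
    """Keep only the first max_seq_length whitespace tokens of a report."""
    return " ".join(report.split()[:max_seq_length])


# A long ground-truth report and a short, generic prediction.
reference = "the heart is normal in size . the lungs are clear without consolidation . " * 20
prediction = "the heart is normal in size . the lungs are clear without consolidation ."

res = {0: [prediction]}
full_gts = {0: [reference]}
truncated_gts = {0: [truncate(reference, 16)]}

bleu_full, _ = Bleu(4).compute_score(full_gts, res)
bleu_trunc, _ = Bleu(4).compute_score(truncated_gts, res)

print("BLEU-4 vs full reference:     ", round(bleu_full[3], 3))
print("BLEU-4 vs truncated reference:", round(bleu_trunc[3], 3))
# The truncated reference is close in length to the prediction, so BLEU's
# brevity penalty almost disappears and the score is much higher, even though
# the prediction covers only a small part of the real report.
```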

Am I misunderstanding something? Looking forward to your reply.

@yangbang18 (Author)
Imagine a simple case:

When evaluating the same model with the same hyper-parameters (e.g., beam size = 3), we will get different results if we alter max_seq_length (e.g., from 128 to 64), because it changes the length of the ground-truth reports. This is not what we expect: the ground-truth reports should stay consistent and not be truncated.
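As a rough illustration (using the same whitespace-truncation approximation as the sketch above, with purely illustrative texts), the identical prediction receives different BLEU scores when only max_seq_length changes:

```python
# Sketch: one fixed prediction is scored against references truncated to
# max_seq_length = 128 vs. 64 tokens; only this preprocessing knob changes,
# yet the reported metric changes with it.
from pycocoevalcap.bleu.bleu import Bleu

reference = "heart size is within normal limits . no focal consolidation is seen . " * 20
prediction = "heart size is within normal limits . no focal consolidation is seen . " * 6

for max_seq_length in (128, 64):
    gts = {0: [" ".join(reference.split()[:max_seq_length])]}
    res = {0: [prediction]}
    bleu, _ = Bleu(4).compute_score(gts, res)
    print(f"max_seq_length={max_seq_length}: BLEU-4 = {bleu[3]:.3f}")
# A fixed model therefore looks better or worse depending on a data-loading
# hyper-parameter, which is why the references should stay untruncated.
```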

@keybo-hk

Why do the best results always appear in the first few epochs?
