Hi, thanks for sharing your work. I am confused about how you prepare the ground-truth reports for evaluation.
As shown in modules/datasets.py#L20, you truncate each report so that its length does not exceed max_seq_length. This is fine for training. However, during inference, as shown in modules/tester.py#L83, you take these truncated reports as the ground truths and use pycocoeval to calculate the scores.
In my opinion, the ground-truth reports should not be truncated, since truncation removes important details and leads to artificially high scores.
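To make the concern concrete, here is a minimal, self-contained sketch of the pattern I mean (the `tokenize` function and the variable names are illustrative, not the repo's exact code):

```python
# Illustrative sketch only -- `tokenize` and `max_seq_length` stand in for
# the repo's actual tokenizer and config value.

def tokenize(report: str) -> list[str]:
    return report.split()

max_seq_length = 8  # the repo uses a larger value; 8 keeps the example short

full_report = ("heart size is normal . lungs are clear . "
               "no pleural effusion or pneumothorax .")

# Truncation on the dataset side (reasonable for fixed-length training):
token_ids = tokenize(full_report)[:max_seq_length]

# The concern: the same truncated sequence is later treated as the ground
# truth at test time, so anything past max_seq_length can never be scored.
truncated_gt = " ".join(token_ids)
print(truncated_gt)  # -> heart size is normal . lungs are clear
```

Here the sentence about pleural effusion and pneumothorax is silently dropped from the reference, so a model that never predicts it pays no penalty.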
Am I getting this wrong? Looking forward to your reply.
When evaluating the same model with the same hyper-parameters (e.g., beam size = 3), we get different results whenever we alter max_seq_length (e.g., from 128 to 64), because changing it also changes the length of the ground-truth reports. This is not what we expect: the ground-truth reports should be consistent and untruncated.
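A quick way to see this sensitivity: a hedged sketch, assuming the pycocoevalcap package is installed and using made-up report text. Only the references change between the two runs; the model output is fixed.

```python
from pycocoevalcap.bleu.bleu import Bleu

full_ref = ("heart size is normal . lungs are clear . "
            "no pleural effusion or pneumothorax .")
hypothesis = "heart size is normal . lungs are clear"  # fixed model output

def truncate(text, max_seq_length):
    # clip the reference the same way the dataset code clips reports
    return " ".join(text.split()[:max_seq_length])

scorer = Bleu(4)
for max_seq_length in (64, 8):  # stands in for e.g. 128 vs 64 above
    gts = {0: [truncate(full_ref, max_seq_length)]}  # references
    res = {0: [hypothesis]}                          # candidate
    score, _ = scorer.compute_score(gts, res)
    print(max_seq_length, [round(s, 3) for s in score])

# With max_seq_length=8 the reference collapses to exactly the hypothesis,
# so BLEU-1..4 all hit 1.0 even though the prediction never changed.
```

The same prediction scores very differently under the two settings, which is why results reported with different max_seq_length values are not comparable.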