
🐛 [Fix] validation_step #134

Merged · 3 commits into WongKinYiu:main · Jan 3, 2025

Conversation

@Adamusen (Contributor) commented Dec 9, 2024

Changed the self.metric() call to self.metric.update().
Removed the logging of per batch metric values.
Filtered out the dummy boxes from the per image "target" tensors before appending them to the metrics accumulator.

fixes #133
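
For illustration, below is a minimal sketch of what these three changes amount to, assuming a Lightning-style validation_step, a torchmetrics MeanAveragePrecision accumulator, and per-image tensors with rows of [class_id, x1, y1, x2, y2(, confidence)] padded with dummy boxes; the names and tensor layout are assumptions for the sketch, not the repository's exact code.

from typing import Dict, List, Union

from torch import Tensor
from torchmetrics.detection import MeanAveragePrecision


def to_metrics_format(prediction: Tensor) -> Dict[str, Union[float, Tensor]]:
    # Rows of [class_id, x1, y1, x2, y2(, confidence)] -> torchmetrics dict.
    bbox = {"boxes": prediction[:, 1:5], "labels": prediction[:, 0].int()}
    if prediction.size(1) == 6:
        bbox["scores"] = prediction[:, 5]
    return bbox


class ValidationSketch:
    """Illustrative stand-in for the model's validation logic."""

    def __init__(self, model):
        self.model = model
        self.metric = MeanAveragePrecision()

    def validation_step(self, batch, batch_idx):
        images, targets = batch
        predictions = self.model(images)  # hypothetical inference call
        preds: List[Dict] = []
        gts: List[Dict] = []
        for pred, target in zip(predictions, targets):
            target = target[target.sum(1) > 0]  # drop padded dummy boxes
            preds.append(to_metrics_format(pred))
            gts.append(to_metrics_format(target))
        # Accumulate only: metric.update() instead of calling the metric,
        # and no per-batch metric values are logged; the expensive compute()
        # runs once at the end of the validation epoch.
        self.metric.update(preds, gts)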

@henrytsui000 (Collaborator)

Thank you for your pull request!

I may need to postpone merging this PR because my original plan was to display the estimated mAP in a progress bar. That would let anyone debugging the code quickly monitor the validation state without having to wait until the entire validation process is complete. However, I haven’t finished implementing that feature yet.

That said, I’ll definitely consider your PR since it significantly speeds up the process.

Best regards,
Henry Tsui

@Adamusen (Contributor, Author) commented Jan 3, 2025

Hey,

On top of the speed problem (which is quite bad on its own with a higher number of validation images), the pull request also fixes an additional problem that leads to incorrect results; see issue #133:

target[target.sum(1) > 0]
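
Concretely, on a made-up padded target tensor that filter does the following (the -1 padding convention comes from the collate_fn, as discussed later in the thread):

import torch

# Made-up per-image target tensor with rows of [class_id, x1, y1, x2, y2];
# the collate_fn pads images that have fewer boxes with rows of -1.
target = torch.tensor(
    [
        [3.0, 10.0, 10.0, 50.0, 60.0],   # real box
        [-1.0, -1.0, -1.0, -1.0, -1.0],  # dummy padding row
        [-1.0, -1.0, -1.0, -1.0, -1.0],  # dummy padding row
    ]
)

filtered = target[target.sum(1) > 0]
print(filtered.shape)  # torch.Size([1, 5]) -- only the real box survives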

Tip: Whenever I debug the train/validation epochs, I usually rewrite the "__len__" function of the dataset inside data_loader.py to return some small value instead of the real one, so that the train and validation epochs only consist of a couple of batches :)
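
A hedged sketch of that debugging trick (the dataset class and attribute names here are placeholders, not the actual code in data_loader.py):

from torch.utils.data import Dataset

DEBUG_NUM_SAMPLES = 64  # e.g. a couple of batches at batch size 16


class PlaceholderDataset(Dataset):
    def __init__(self, annotations):
        self.annotations = annotations

    def __len__(self):
        # return len(self.annotations)  # original behaviour
        return min(len(self.annotations), DEBUG_NUM_SAMPLES)  # temporary debug cap

    def __getitem__(self, index):
        return self.annotations[index]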

Best regards,
Adam

@henrytsui000 (Collaborator)

Hi Adam,

Thank you for pointing this out! Regarding the padding target issue, how about adding:

prediction = prediction[prediction[:, 0] != -1]

after line 486:

def to_metrics_format(prediction: Tensor) -> Dict[str, Union[float, Tensor]]:
    bbox = {"boxes": prediction[:, 1:5], "labels": prediction[:, 0].int()}
    if prediction.size(1) == 6:
        bbox["scores"] = prediction[:, 5]  # reconstructed continuation of the quoted snippet: attach confidence scores when present
    return bbox

I believe this approach is more straightforward for filtering out padding targets than using target.sum(1) > 0. Let me know your thoughts!
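
As a quick illustration (made-up values), the proposed class-ID filter applied to a padded tensor:

import torch

# Same made-up layout as before: rows of [class_id, x1, y1, x2, y2],
# with padding rows filled entirely with -1 by the collate_fn.
prediction = torch.tensor(
    [
        [3.0, 10.0, 10.0, 50.0, 60.0],   # real box
        [-1.0, -1.0, -1.0, -1.0, -1.0],  # dummy padding row
    ]
)

prediction = prediction[prediction[:, 0] != -1]  # keep rows whose class id is not -1
print(prediction.shape)  # torch.Size([1, 5])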

Best regards,
Henry Tsui

@Adamusen (Contributor, Author) commented Jan 3, 2025

Hey,

Yes, your proposed solution should work just as well :)

I also considered filtering based on class IDs being -1 at first, but then decided to go with the sum instead to make the solution more robust: it doesn't depend on the dataloader's collate_fn filling class IDs with -1 values, so it wouldn't break if that convention were removed in the future or something :)

Best regards,
Adam

@henrytsui000 (Collaborator)

Hi Adam,

I’d prefer filtering by -1 because I think the padding class will consistently be -1. In contrast, shifting bounding boxes or certain data augmentations applied by users could accidentally change whether a row's sum is greater than 0, potentially leading to unintended filtering.

Feel free to update the code if you agree, or let me know, and I’ll make the update. It’s up to you!

Best regards,
Henry Tsui

@Adamusen (Contributor, Author) commented Jan 3, 2025

Hey,

Since the collate_fn is applied after the data augmentations, this shouldn't be an issue; as such, I find both solutions equally valid, so I leave it up to you to decide which one you prefer :)

Edit: Oh, I think I just got what you meant. Some preprocessing might shift some bounding box coordinates to negative values, which could make the sum less than zero even for a valid label. Even though such labels are problematic on their own, I see your point, and with that, yes, filtering on class IDs being -1 is the safer option. I'll change the code :)

Best regards,
Adam

Changed dummy target filtering to be based on class IDs being -1 and moved it inside the to_metrics_format function.
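
A minimal sketch of what the merged version of that filter plausibly looks like, built from the snippet quoted earlier in the thread; the exact function body is an assumption, not the verbatim merged code.

from typing import Dict, Union

from torch import Tensor


def to_metrics_format(prediction: Tensor) -> Dict[str, Union[float, Tensor]]:
    # Drop padded dummy rows (class id == -1) before building the metric dict.
    prediction = prediction[prediction[:, 0] != -1]
    bbox = {"boxes": prediction[:, 1:5], "labels": prediction[:, 0].int()}
    if prediction.size(1) == 6:
        bbox["scores"] = prediction[:, 5]
    return bbox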
@henrytsui000 henrytsui000 merged commit 8094323 into WongKinYiu:main Jan 3, 2025
2 of 3 checks passed
Successfully merging this pull request may close these issues:

Too slow and incorrect validation metrics (#133)