Add epsilon for NaN prevention + test case to RetinaNet #1684

Merged
merged 3 commits into from
May 12, 2023

Conversation

LukeWood
Contributor

@LukeWood LukeWood commented Apr 7, 2023

fixes #1683

I actually took a tf.cond()-based approach here that zeros out the sample weights instead of adding an epsilon. I did this because I found that when we divide by the normalizer and the normalizer is zero, the background classes all get absolutely massive class weights, which causes massive gradients.

An alternative is perhaps to use a small value, e.g. ~0.3, for the class weights? I think the ideal mask is something like tf.where(positive_mask, 0.1, 0.0), where 0.1 stands in for any tiny value. Using all zeros is probably safer and easier, though.
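
To make the trade-off concrete, here is a minimal plain-Python sketch of the arithmetic only. It is not the keras-cv implementation: the helper name normalized_weights is hypothetical, and it assumes the normalizer is the count of positive anchors and that each weight is 1 / normalizer. In TensorFlow the branch on an empty batch would be expressed with tf.cond() rather than a Python if.

```python
# Hypothetical helper; mirrors the arithmetic only, not keras-cv's actual code.

def normalized_weights(positive_mask, epsilon=None):
    """Return one loss weight per anchor, each equal to 1 / normalizer.

    positive_mask: list of 0/1 flags, 1 where an anchor matched a box.
    epsilon: if given, it is added to the normalizer (the epsilon approach);
             if None, all weights are zeroed when there are no positives.
    """
    # Assumption: the normalizer is the number of positive anchors.
    normalizer = float(sum(positive_mask))

    if epsilon is not None:
        # Epsilon approach: avoids dividing by zero, but an empty batch
        # still produces enormous weights (and so enormous gradients).
        return [1.0 / (normalizer + epsilon) for _ in positive_mask]

    if normalizer == 0.0:
        # Zero-out approach: a batch with no positives contributes nothing.
        return [0.0 for _ in positive_mask]

    return [1.0 / normalizer for _ in positive_mask]


empty_batch = [0, 0, 0, 0]   # no anchor matched any box
normal_batch = [1, 0, 1, 0]  # two positive anchors

print(normalized_weights(empty_batch, epsilon=1e-8))  # very large weights
print(normalized_weights(empty_batch))                # all zeros
print(normalized_weights(normal_batch))               # 0.5 for every anchor
```

With the epsilon variant the empty batch no longer yields NaN, but every weight is on the order of 1 / epsilon, which is the "massive gradients" problem described above. Zeroing the weights sidesteps both issues at the cost of ignoring that batch.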

@LukeWood LukeWood requested a review from ianstenbit April 7, 2023 05:28
@LukeWood
Contributor Author

LukeWood commented Apr 7, 2023

/gcbrun

Contributor

@ianstenbit ianstenbit left a comment

Awesome -- thank you Luke!

I'll try this same approach for YOLOV8

@LukeWood
Contributor Author

LukeWood commented Apr 7, 2023

/gcbrun

@LukeWood
Contributor Author

/gcbrun

@LukeWood
Contributor Author

Looks like we need to upgrade TensorFlow.

@LukeWood
Contributor Author

/gcbrun

@LukeWood LukeWood merged commit 0f1603d into keras-team:master May 12, 2023
@LukeWood LukeWood deleted the epsilon branch May 12, 2023 00:40
jbischof pushed a commit to jbischof/keras-cv that referenced this pull request May 18, 2023
)

* add epsilon test case

* Drop batches with no values

* Change to batch
ghost pushed a commit to y-vectorfield/keras-cv that referenced this pull request Nov 16, 2023
)

* add epsilon test case

* Drop batches with no values

* Change to batch
Development

Successfully merging this pull request may close these issues.

Avoid NaNs in RetinaNet training by adding an epsilon to normalizer
2 participants