Add epsilon for NaN prevention + test case to RetinaNet #1684
fixes #1683
I actually took a `tf.cond()`-based approach here to zero out the sample weights instead. I did this because I found that when we divide by the normalizer and the normalizer is zero, the background classes all get absolutely massive class weights, which causes massive gradients. An alternative is perhaps to use a small value, e.g. around 0.3, for the class weights. I think the ideal mask is something like `tf.where(positive_mask, 0.1, 0.0)`, where 0.1 stands in for any tiny value. That said, using all zeros is probably safer and easier.
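For reviewers, here is a minimal sketch of the idea described above, not the code in this PR. The helper name `normalize_class_weights` and its arguments are hypothetical, and it assumes the normalizer is the number of positive anchors in the batch. When that count is zero, the weights are zeroed rather than divided.

```python
import tensorflow as tf


def normalize_class_weights(positive_mask, class_weights):
    """Hypothetical helper: scale class weights by the positive-anchor count.

    Assumption: the normalizer is the number of True entries in positive_mask.
    If there are none, return all-zero weights instead of dividing by zero.
    """
    class_weights = tf.cast(class_weights, tf.float32)
    normalizer = tf.reduce_sum(tf.cast(positive_mask, tf.float32))
    return tf.cond(
        normalizer > 0.0,
        lambda: class_weights / normalizer,    # usual case: normalize
        lambda: tf.zeros_like(class_weights),  # no positives: zero the weights
    )


# A batch with no positive anchors yields zero weights, not inf or NaN.
empty = normalize_class_weights([False, False, False], [1.0, 1.0, 1.0])
# A batch with two positive anchors divides each weight by 2.
mixed = normalize_class_weights([True, True, False], [1.0, 1.0, 1.0])
```

Dividing by `normalizer + epsilon` instead would avoid the NaN but still produce very large weights when the normalizer is zero, which is the gradient problem described above.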