Add focal loss tutorial. #1440

Closed
wants to merge 2 commits into from

Contributor

@us us commented Mar 26, 2020

Related issue: #361

@review-notebook-app

Check out this pull request on ReviewNB

You'll be able to see Jupyter notebook diff and discuss changes. Powered by ReviewNB.

@Squadrick
Member

IIRC, focal loss was specifically formulated for detection tasks (like PASCAL). It was originally introduced with RetinaNet, which wasn't doing simple classification.

From the paper:

The Focal Loss is designed to address the one-stage object detection scenario in which there is an extreme imbalance between foreground and background classes during training (e.g., 1:1000).

A tutorial on MNIST classification isn't suitable. It's better to do a simple detection task using a single shot detector, or, if that's too complex, we can artificially imbalance the dataset to show where focal loss really shines.
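One way the artificial imbalance could look is sketched below with NumPy. The helper name `make_imbalanced`, its parameters, and the synthetic labels are assumptions for illustration only; they are not part of the PR or of TensorFlow Addons. It keeps every negative sample and only enough positives to reach roughly the 1:1000 ratio mentioned in the paper.

```python
import numpy as np

def make_imbalanced(labels, positive_class, neg_per_pos, seed=0):
    """Return sorted indices that keep all negatives and ~1 positive per `neg_per_pos` negatives.

    Hypothetical helper: `positive_class` is the label treated as foreground,
    everything else is treated as background.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(labels == positive_class)
    neg_idx = np.flatnonzero(labels != positive_class)
    # Number of positives to keep, capped by how many positives exist.
    n_pos = min(max(1, len(neg_idx) // neg_per_pos), len(pos_idx))
    kept_pos = rng.choice(pos_idx, size=n_pos, replace=False)
    return np.sort(np.concatenate([kept_pos, neg_idx]))

# Synthetic stand-in for dataset labels: 500 positives, 9000 negatives.
labels = np.array([1] * 500 + [0] * 9000)
keep = make_imbalanced(labels, positive_class=1, neg_per_pos=1000)
print("positives kept:", int((labels[keep] == 1).sum()))
print("negatives kept:", int((labels[keep] == 0).sum()))
```

The returned indices can be used to slice both the images and the labels, so the two stay aligned.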

@us
Contributor Author

us commented Mar 26, 2020

Can we try it on MNIST with binary labels, i.e. whether the digit is 1 or not?

@bhack
Contributor

bhack commented Mar 26, 2020

I agree with @Squadrick. I think a single-stage, anchor-free object detector could be simple enough for a focal loss tutorial.

E.g., see the focal loss in https://github.com/xuannianz/keras-CenterNet

@us
Contributor Author

us commented Mar 26, 2020

@Squadrick @bhack I changed the dataset: MNIST now has just 3 categories (0, 1, and others), so the dataset is imbalanced and the loss graph matches the graph in the paper.
Training a bigger model would take more time, and Colab is inefficient for this (continuous timeout errors, etc.).
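The relabeling described above might look roughly like the following NumPy sketch. The function name `relabel_three_way` and the random stand-in labels are assumptions for illustration; the notebook in the PR may do this differently.

```python
import numpy as np

def relabel_three_way(labels):
    """Map digit labels 0-9 to three classes: 0 -> 0, 1 -> 1, everything else -> 2."""
    labels = np.asarray(labels)
    return np.where(labels <= 1, labels, 2)

# Random stand-in for MNIST digit labels (hypothetical data, not the real dataset).
rng = np.random.default_rng(0)
digits = rng.integers(0, 10, size=10_000)

classes = relabel_three_way(digits)
counts = np.bincount(classes, minlength=3)
print("samples per class (0, 1, others):", counts)
```

With roughly uniform digits, the "others" class ends up about eight times larger than either of the other two, which is a much milder imbalance than the 1:1000 detection setting quoted from the paper.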

@@ -0,0 +1,467 @@
{
Contributor

It would be better to put compile() and fit() in the "Train and Evaluate" section rather than in "Building The Model".


@aaronmondal
Contributor

Tbh I feel like this may be an unnecessary tutorial.

  1. From a code point of view, using the focal loss essentially just means swapping sigmoid_cross_entropy_with_logits for sigmoid_focal_crossentropy anyway. The mathematical intricacies which justify using the focal loss over regular crossentropy are not addressed, apart from the copy of the SigmoidFocalCrossentropy docstring.
  2. The tfa.__version__ is outdated by over half a year.
  3. IMO one of the mistakes one could make when applying SigmoidFocalCrossentropy is wrong scaling of the model outputs depending on from_logits. This is not addressed in the tutorial.
  4. The parameter alpha is not explained and the image from the paper omits alpha as well, hence the equation from the image does not actually represent the calculations performed by SigmoidFocalCrossentropy.
  5. Gradient instability for small gamma is not addressed.
  6. The create_model method in the tutorial defines a Sequential model, then builds and trains it as well, i.e. it does not just "create a model". I would also argue that rebuilding the entire model just for the sake of changing a loss hyper-parameter is not good practice, as this could be achieved by re-compiling the model as well.
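To make points 3 and 4 concrete, here is a minimal NumPy sketch of the alpha-balanced focal loss from the paper, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is an illustration written for this comment, not the TensorFlow Addons implementation; the function name `focal_loss`, the `eps` clipping, and the example values are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, from_logits=False, eps=1e-7):
    """Element-wise alpha-balanced focal loss for binary targets (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # Convert logits to probabilities only when the caller says they are logits.
    p = sigmoid(y_pred) if from_logits else y_pred
    p = np.clip(p, eps, 1.0 - eps)  # assumption: clip to avoid log(0)
    # p_t is the probability assigned to the true class.
    p_t = y_true * p + (1.0 - y_true) * (1.0 - p)
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

y = np.array([1.0, 0.0, 1.0])
logits = np.array([2.0, -1.0, -3.0])  # two easy examples, one hard positive

correct = focal_loss(y, logits, from_logits=True)
# Same values passed as if they were probabilities: the scaling mistake from point 3.
wrong = focal_loss(y, logits, from_logits=False)
print("from_logits=True :", correct)
print("from_logits=False:", wrong)
```

With gamma=0 and alpha=0.5 the expression reduces to half the ordinary binary crossentropy, which is a convenient sanity check when comparing against the equation in the image.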

I agree with @bhack that the most prominent use case for the focal loss is object detection. A tutorial on object detection which makes use of the focal loss would probably be a lot more informative. In that case it would also not be such a big problem to be less mathematically precise about the formulation of the focal loss itself.

@bhack
Contributor

bhack commented Nov 26, 2020

/cc @WindQAQ @seanpmorgan Probably we need to deprecate focal loss before the next release https://github.com/keras-team/keras-cv/blob/master/keras_cv/losses/focal_loss.py

@bhack
Contributor

bhack commented May 10, 2022

We now have a focal loss in Keras CV. I suggest porting this tutorial to the Keras io repository:
https://github.com/keras-team/keras-io

@bhack bhack closed this May 10, 2022