
create Losses module #1264

Merged
merged 10 commits into from
Jul 9, 2020

Conversation

CarloLucibello
Member

Continuation of #1150, grouping losses under a Losses module as discussed.

An alternative for the module name could be Loss, but Losses seemed more natural.

I also renamed bce back to binarycrossentropy: now that the function lives within a module, I could provide a deprecation path without changing the function name with respect to the last tagged release.

Some of the functions carry a _loss suffix (e.g. hinge_loss, poisson_loss). We could drop it now that we have a namespace disambiguating the meaning, but again it seems more natural to keep it, closer to the way people refer to these functions when speaking.
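The deprecation path mentioned above could look something like the following sketch (the bce shim and the exact signature are illustrative; the PR's actual code may differ):

```julia
module Losses

using Statistics: mean

export binarycrossentropy

# the renamed function, as described above
binarycrossentropy(ŷ, y; agg = mean) =
    agg(@. -y * log(ŷ) - (1 - y) * log(1 - ŷ))

# deprecation shim: the old short name keeps working, with a warning
function bce(ŷ, y; kws...)
    Base.depwarn("`bce` is deprecated, use `binarycrossentropy` instead.", :bce)
    return binarycrossentropy(ŷ, y; kws...)
end

end # module
```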

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable
  • Final review from @MikeInnes or @dhairyagandhi96 (for API changes).

@johnnychen94
Contributor

johnnychen94 commented Jul 2, 2020

An alternative for the module name could be Loss, but Losses seemed more natural.

How about Metrics as it can be a broader concept than Losses?

@CarloLucibello
Member Author

How about Metrics as it can be a broader concept than Losses?

mhmh, don't really fancy that, I usually think of metrics as something you keep track of, not optimize

@CarloLucibello CarloLucibello added this to the v0.11 milestone Jul 2, 2020
This was referenced Jul 2, 2020
@DhairyaLGandhi
Member

Metrics sounds better to my ear as well; the metrics you mentioned also just sound like tracking information that would be used to monitor training progress.

@johnnychen94
Contributor

johnnychen94 commented Jul 3, 2020

Disclaimer: I don't have a preference here. I proposed Metrics because

  • it is not uncommon to track losses during training, and using Metrics provides a place for other non-loss metrics, such as accuracy, psnr, ssim, etc.
  • Losses has three s's and I sometimes incorrectly type it as 😢 loses.

@CarloLucibello
Member Author

CarloLucibello commented Jul 3, 2020

Keras has two separate modules, Losses and Metrics:
https://keras.io/api/metrics/
"A metric is a function that is used to judge the performance of your model.
Metric functions are similar to loss functions, except that the results from evaluating a metric are not used when training the model. Note that you may use any loss functions as a metric function."

PyTorch instead has only losses, with metrics typically hand-coded by users.

We currently have only loss functions here, not arbitrary metrics, so I think Losses would be a more appropriate name, one that conveys a clear sense of the primary purpose. If the concern is about future expandability, we could add a Metrics module at any point, possibly re-exporting Losses as well.
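A hypothetical Metrics module along those lines, re-exporting Losses, might look like this sketch (module contents and function names are illustrative, not part of the PR):

```julia
module Metrics

using ..Losses              # the existing Losses module
using ..Losses: mse, mae
export mse, mae, accuracy   # losses can double as metrics

# an example non-loss metric
accuracy(ŷ, y) = sum(ŷ .== y) / length(y)

end # module
```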

@CarloLucibello
Member Author

@DhairyaLGandhi merge?

@CarloLucibello
Member Author

CI failure likely due to JuliaDiff/ChainRules.jl#227

include("layers/basic.jl")
include("layers/conv.jl")
include("layers/recurrent.jl")
include("layers/normalise.jl")

include("data/Data.jl")

include("losses/Losses.jl")
using .Losses # TODO: stop importing Losses in Flux's namespace in v0.12
Member

Losses should export all the functions as before.

Member Author

It does; that is, no loss is exported from the module, but they all live in the Flux namespace, as they currently do. The comment is about keeping the losses only in the Flux.Losses namespace, something we may want to do in v0.12.


agg((ŷ .- y).^2)
"""
mse(ŷ, y; agg=mean) = agg((ŷ .- y).^2)
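For reference, a quick sketch of how the agg keyword behaves with this definition (Statistics.mean is the default reduction):

```julia
using Statistics: mean

mse(ŷ, y; agg = mean) = agg((ŷ .- y).^2)

mse([1.0, 2.0], [1.0, 4.0])             # mean of (0.0, 4.0) -> 2.0
mse([1.0, 2.0], [1.0, 4.0]; agg = sum)  # sum of (0.0, 4.0)  -> 4.0
```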
Member

Since there is a performance regression, maybe we should consider removing the aggregation for now.

Member Author

if you refer to #1255, mean is not affected by it

Member

I mean specifically in the backwards pass.

Member Author

which regression?

Member

This is with Flux@0.10.4. cmse is the one from master; mymse is an implementation I am currently working on.

[screenshot: benchmark comparing cmse and mymse]

Member Author

Ok. But we can't remove aggregation now and add it back later; that's a lot of breakage. We should just improve performance under the hood. If you have a better implementation for agg=mean, we can add a branch for that. But really, for such basic functions as mean((x .- x).^2) we should expect Zygote to be performant, and track down the issue if it is not.
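A branch specialized for agg = mean, as suggested here, could be sketched like this (illustrative only; sum(abs2, ...) fuses the square with the reduction instead of materialising the squared array):

```julia
using Statistics: mean

function mse(ŷ, y; agg = mean)
    if agg === mean
        # fast path for the default aggregation
        return sum(abs2, ŷ .- y) / length(y)
    else
        return agg((ŷ .- y).^2)
    end
end
```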


CUDA.allowscalar(false)
Member

Why is this PR moving tests around? The tests should be largely untouched, and especially tests unrelated to losses should be untouched.

Member Author

I had to reorganize the CUDA tests a bit, in order to reflect the presence of the new Losses module and the new test/cuda/losses.jl file.

@DhairyaLGandhi
Member

The package shouldn't move the definitions of losses into a module like this. See how Optimisers.jl handles the directory structure.

@DhairyaLGandhi
Member

DhairyaLGandhi commented Jul 6, 2020

I would also consider calling this package Loss.jl or similar; otherwise it sounds weird to my ear.

src/losses/Losses.jl (outdated review thread, resolved)
@CarloLucibello
Member Author

I would also consider calling this package Loss.jl or similar; otherwise it sounds weird to my ear.

I fear using the singular here may go against Julia's guidelines for module naming. Any opinions from @MikeInnes @oxinabox?

@DhairyaLGandhi
Member

We should also make sure that we don't message things incorrectly. Currently it seems that we are saying these bits of code are the loss functions, as though they were somehow separate from being generic transforms that we can differentiate through. Continuing to call them stateless layers makes that a little more explicit. This is an important distinction to make for a newcomer to our ecosystem.

@DhairyaLGandhi
Member

bors try

bors bot added a commit that referenced this pull request Jul 9, 2020
@DhairyaLGandhi
Member

Hopefully we haven't missed any tests

@bors
Contributor

bors bot commented Jul 9, 2020

try

Build succeeded:

@CarloLucibello
Member Author

bors r+

@bors
Contributor

bors bot commented Jul 9, 2020

Build succeeded:

@bors bors bot merged commit 683d580 into master Jul 9, 2020
@maetshju maetshju mentioned this pull request Jul 21, 2020
4 tasks
@CarloLucibello CarloLucibello deleted the cl/losses branch January 7, 2021 08:46
bors bot added a commit that referenced this pull request Jan 20, 2021
1287: Add CTC loss to new Losses module r=CarloLucibello a=maetshju

This is a redux of adding the connectionist temporal classification loss from #342, now that the Losses module has been merged in #1264. Discussion in #342 suggested that a new PR would be easier than rebasing.

Since the last commit in #342, functions and data structures from `CUDAnative.jl` and `CuArrays.jl` have been updated to work with `CUDA.jl`. This is in addition to incorporating the loss function into the Losses module.

### PR Checklist

- [X] Tests are added
- [X] Entry in NEWS.md
- [X] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).


Co-authored-by: Matt Kelley <matthew.curtis.kelley@gmail.com>
Co-authored-by: Matthew C. Kelley <matthew.curtis.kelley@gmail.com>