create Losses module #1264
Conversation
How about …
Mhmh, I don't really fancy that; I usually think of metrics as something you keep track of, not something you optimize.
Disclaimer: I don't have a preference here. I proposed …
Keras has two separate modules, Losses and Metrics; PyTorch instead has only losses, with metrics typically hand-coded by users. We currently have only loss functions here, not arbitrary metrics, so I think …
@DhairyaLGandhi merge?
CI failure likely due to JuliaDiff/ChainRules.jl#227
include("layers/basic.jl") | ||
include("layers/conv.jl") | ||
include("layers/recurrent.jl") | ||
include("layers/normalise.jl") | ||
|
||
include("data/Data.jl") | ||
|
||
include("losses/Losses.jl") | ||
using .Losses # TODO: stop importing Losses in Flux's namespace in v0.12 |
Losses should export all the functions as before.
It is; that is, no loss is exported from Flux, but they all live in the Flux namespace, as they currently do. The TODO comment is about keeping the losses only in the Flux.Losses namespace, something we may want to do in v0.12.
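To make the layout concrete, here is a minimal sketch (not the PR's actual file) of how the submodule and the re-import fit together; the export list here is hypothetical:

```julia
# src/losses/Losses.jl -- hypothetical minimal version
module Losses

using Statistics: mean

export mse  # the real module exports the full set of loss functions

mse(ŷ, y; agg = mean) = agg((ŷ .- y) .^ 2)

end # module
```

`using .Losses` in Flux.jl then pulls the exported names into the `Flux` namespace, so `Flux.mse` keeps working even though `Flux` itself does not re-export the losses.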
src/losses/Losses.jl
```julia
    agg((ŷ .- y).^2)
"""
mse(ŷ, y; agg=mean) = agg((ŷ .- y).^2)
```
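For context, a usage sketch of the `agg` keyword, assuming the definition above (the values are made up):

```julia
using Statistics: mean

ŷ = [0.9f0, 0.2f0, 0.8f0]
y = [1.0f0, 0.0f0, 1.0f0]

mse(ŷ, y)                  # default: mean of the squared errors
mse(ŷ, y; agg = sum)       # total squared error instead of the mean
mse(ŷ, y; agg = identity)  # no aggregation: per-element squared errors
```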
Since there is a performance regression, maybe we should consider removing the aggregation for now.
If you are referring to #1255, `mean` is not affected by it.
I mean specifically in the backwards pass.
Which regression?
OK, but we can't remove aggregation now and add it back later; that's a lot of breakage. We should just improve performance under the hood. If you have a better implementation for `agg=mean`, we can add a branch for that. But really, for such basic functions as `mean((ŷ .- y).^2)` we should expect Zygote to be performant, and track down the issue if it is not.
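For illustration, a branch for the default aggregation could be added by dispatch; this is a sketch of the idea, not code from the PR (`_mse` is a hypothetical helper):

```julia
using Statistics: mean

mse(ŷ, y; agg = mean) = _mse(agg, ŷ, y)

# Generic path: works for any aggregation function.
_mse(agg, ŷ, y) = agg((ŷ .- y) .^ 2)

# Specialised method for the default `agg = mean`; a faster implementation
# (or a custom Zygote rrule) could be swapped in if the generic path is slow.
_mse(::typeof(mean), ŷ, y) = sum(abs2, ŷ .- y) / length(y)
```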
```julia
CUDA.allowscalar(false)  # error out on scalar indexing of GPU arrays
```
Why is this PR moving tests around? The tests should be largely untouched, and especially tests unrelated to losses should be untouched.
I had to reorganize the CUDA tests a bit to reflect the presence of the new Losses module and the new test/cuda/losses.jl file.
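For reference, a check in a file like test/cuda/losses.jl could look roughly like this (a sketch; the actual tests may differ):

```julia
using Flux, CUDA, Test

CUDA.allowscalar(false)  # fail loudly on slow scalar indexing of GPU arrays

x = rand(Float32, 3, 4)
y = rand(Float32, 3, 4)

# Each loss should give the same result on GPU arrays as on CPU arrays.
@test Flux.Losses.mse(cu(x), cu(y)) ≈ Flux.Losses.mse(x, y)
```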
The package shouldn't move the definition of losses into a module like this. See how Optimisers.jl handles the directory structure.
I would also consider calling this package Loss.jl or similar; otherwise it sounds weird to my ear.
I fear using the singular here may go against Julia's guidelines for module naming. Any opinion from @MikeInnes @oxinabox?
We should also make sure that we don't message things incorrectly. Currently it seems that we are saying that these bits of code are loss functions that are somehow separate from being generic transforms we can differentiate through. Continuing to call them stateless layers makes that a little more explicit. This is an important distinction to make for a newcomer to our ecosystem.
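Concretely, in that framing a loss is just a stateless, differentiable function, so it composes with `gradient` like any other transform (a minimal sketch):

```julia
using Flux  # `gradient` comes from Zygote via Flux

ŷ = rand(Float32, 5)
y = rand(Float32, 5)

# Differentiate through the loss exactly as through any stateless layer.
g, = gradient(ŷ -> Flux.Losses.mse(ŷ, y), ŷ)
```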
bors try

Hopefully we haven't missed any tests

try
Build succeeded:

bors r+

Build succeeded:
1287: Add CTC loss to new Losses module r=CarloLucibello a=maetshju

This is a redux of adding the connectionist temporal classification loss from #342, now that the Losses module has been merged in #1264. Discussion in #342 suggested that a new PR would be easier than rebasing. Since the last commit in #342, functions and data structures from `CUDAnative.jl` and `CuArrays.jl` have been updated to work with `CUDA.jl`. This is in addition to incorporating the loss function into the Losses module.

### PR Checklist

- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).

Co-authored-by: Matt Kelley <matthew.curtis.kelley@gmail.com>
Co-authored-by: Matthew C. Kelley <matthew.curtis.kelley@gmail.com>
Continuation of #1150, grouping losses under a `Losses` module as discussed. An alternative for the module name could be `Loss`, but `Losses` seemed more natural.

I also renamed `bce` back to `binarycrossentropy`, since now that the function is within a module I could provide a deprecation path without changing the function name with respect to the last tagged release (see the sketch below).

Some of the functions contain the `_loss` suffix (e.g. `hinge_loss`, `poisson_loss`). We could drop that now that we have a namespace disambiguating the meaning, but again it seems more natural to keep it, closer to the way people refer to these losses when speaking.
PR Checklist

- [ ] Final review from `@MikeInnes` or `@dhairyagandhi96` (for API changes).