-
-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataLoader with NamedTuple #1221
Conversation
e1020c1
to
223a68b
Compare
I like this. Interaction with |
Rather than the Unions, would it make more sense have the NamedTuple constructors just forward to the regular ones. Is there any other functional property that we are using here that warrants NamedTuple? Maybe it would be better to just expose these as kwargs in the first place, something like |
@CarloLucibello Hmm that's true. Maybe it's more consistent if we always call @dhairyagandhi96 I don't understand? The idea is to be able to refer to the tensors by name, which can't be done if you convert the NamedTuple to a Tuple. Maybe I missunderstood your comment. |
Looking from the code, the naming of said tensors is to allow users some convenience while sending the input, right? Or is the intention for the keys to be used inside the loss function? |
I am using the keys inside the loss function (but I am also writing a training loop by hand). I think this could be a general use-case. |
What do you think of #1227? I could try a fix here. |
that should be fixed, but better do it in a separate PR |
In the end, DataLoader should end up supporting any type with some dataset-like interface. Changes here only involve overloading |
end | ||
|
||
_getobs(data::Tuple, i) = map(x -> _getobs(x, i), data) | ||
_getobs(data::AbstractArray, i) = data[ntuple(i -> Colon(), Val(ndims(data) - 1))..., i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the N
from the type and drop the Val
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, but why is that better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's largely for it to be cleaner, doing it like this doesn't seem to add any benefit and increases the complexity of the code while reading it
Merge? |
needs a rebase |
accept suggested changes Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
selectdim can lead to type instability, see https://discourse.julialang.org/t/why-selectdim-is-type-instable/25271/5
rebased |
There are no 'minor' API changes that are allowed to go through without review; there are just API changes, and this is one of them. (The fact that you added documentation for it should be a giveaway). I think this addition is fine though. It might be nice to generalise it (we could potentially reuse
It would be helpful to add a test to make sure the behaviour is right, but I don't think this is the case, eg: julia> +((a=1,b=2)...)
3 |
@MikeInnes The problem with Line 88 in 254e4a7
doesn't propagate the tensor names. So the user has to be careful to define the |
Ah, I misread Carlo's post. Yes, if we want to avoid that we'd have to do something significantly more complex and it's better to keep this simple. One option is to write |
In any case this |
Depends; this PR is something of a decision point, because if we wanted @CarloLucibello can decide if he's happy with the details but the current API change LGTM, anyway. |
Once #1149 implementing #1149 (comment) gets merged, Let's merge this |
I do like |
Build succeeded: |
Just a couple of small changes, so that
DataLoader
can be created with aNamedTuple
of tensors instead ofTuple
. This way the tensors can be referred to by name. For exampleIf we only use tuples, then in datasets with multiple tensors one has to be careful about the order in which the tensors are fed into the
DataLoader
constructor and be consistent with this elsewhere. WithNamedTuples
one just have to be consistent about the names used, which I think is a minor improvement.CC @CarloLucibello
PR Checklist
I don't think this qualifies as an API change. It's just a minor feature addition. So final review probably not required.
@MikeInnes
or@dhairyagandhi96
(for API changes).