torch/jax Dataloader support #55
I would also add a simple `as_dataloader` function to provide out-of-the-box utilities at this point.
Nevertheless, it is a good PR. I could nitpick or disagree a bit on the implementation, as I wanted to do some dynamic inheritance based on the available packages to automatically define the return object type, but simpler is better down the line.
All good. With the conversion to tensor, getting a dataset into a dataloader should be a one-liner (for real this time), but I still think that we should have a dummy `as_iter` method to return a default dataloader.
In any case, thank you for the bug fixing and the work! I'll probably try to clean up the conditional import of torch and jax on my end somehow, but the PR is 🔥
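A minimal sketch of what a default `as_iter` helper could look like, assuming a dataset that only exposes `__len__` and `__getitem__`. The signature and the `batch_size`, `shuffle` and `seed` parameters are hypothetical and not part of this PR; this is a plain-Python stand-in, not a replacement for a framework dataloader.

```python
import random
from typing import Any, Iterator, List, Optional, Sequence


def as_iter(
    dataset: Sequence[Any],
    batch_size: int = 2,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[Any]]:
    """Yield lists of items from any object with __len__ and __getitem__.

    Hypothetical helper: the name follows the review comment, the
    parameters are assumptions made for illustration.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    indices = list(range(len(dataset)))
    if shuffle:
        # A dedicated Random instance keeps the global RNG state untouched.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        # The last batch may be shorter than batch_size.
        yield [dataset[i] for i in indices[start:start + batch_size]]
```

With the torch format selected, the same dataset could instead be handed directly to `torch.utils.data.DataLoader`, which is the one-liner mentioned above.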
Currently, getting a torch_geometric Dataloader is quite complicated since it requires multiple steps. This PR addresses those concerns by introducing functionality similar to `with_format` of huggingface datasets (Link). This now allows us to get `numpy` (default), `torch` or `jax` arrays from a `__getitem__` call.
For example, to get a `torch_geometric` Data object instead of the sklearn Bunch from `__getitem__`, it might be convenient to use a function on the data bunch returned.
Note: I looked through huggingface datasets; they too do not have any such method to get a dataloader from their datasets.
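To illustrate the idea, here is a rough sketch of per-item format conversion combined with a user-supplied transform. It is not the code in this PR: `SUPPORTED_FORMATS`, `convert_item` and `ToyDataset` are hypothetical names, and a plain dict stands in for the sklearn Bunch. The backends are imported lazily so that only the requested one needs to be installed.

```python
import importlib
from typing import Any, Callable, Dict, List, Optional

# Hypothetical names for illustration; only the three formats come from the PR text.
SUPPORTED_FORMATS = ("numpy", "torch", "jax")


def convert_item(item: Dict[str, Any], array_format: str = "numpy") -> Dict[str, Any]:
    """Convert every value of a dict-like item to the requested array type."""
    if array_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unknown array_format {array_format!r}, expected one of {SUPPORTED_FORMATS}"
        )
    # Import the backend only when it is asked for (conditional import).
    if array_format == "numpy":
        convert = importlib.import_module("numpy").asarray
    elif array_format == "torch":
        convert = importlib.import_module("torch").as_tensor
    else:
        convert = importlib.import_module("jax.numpy").asarray
    return {key: convert(value) for key, value in item.items()}


class ToyDataset:
    """Stand-in dataset whose __getitem__ returns converted, optionally transformed items."""

    def __init__(
        self,
        records: List[Dict[str, Any]],
        array_format: str = "numpy",
        transform: Optional[Callable[[Dict[str, Any]], Any]] = None,
    ) -> None:
        self.records = records
        self.array_format = array_format
        self.transform = transform

    def __len__(self) -> int:
        return len(self.records)

    def __getitem__(self, idx: int) -> Any:
        item = convert_item(self.records[idx], self.array_format)
        # The transform is applied to the converted item, mirroring the
        # "function on the data bunch returned" described above.
        return self.transform(item) if self.transform is not None else item
```

For instance, `ToyDataset(records, array_format="torch", transform=my_fn)` would pass each item of torch tensors to `my_fn`, which is where a `torch_geometric` Data object could be built; `my_fn` and `records` are placeholders here.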
TODOs
`array_format` functionality. Will also add a test for the transform.
@prtos @FNTwin
Checklist: