Support numpy/torch/tf/jax formatting for IterableDataset #5083

Closed
lhoestq opened this issue Oct 6, 2022 · 2 comments
Labels: enhancement, good second issue, streaming

Comments

Member

lhoestq commented Oct 6, 2022

Right now IterableDataset doesn't do any formatting.

In particular this code should return a numpy array:

from datasets import load_dataset

ds = load_dataset("imagenet-1k", split="train", streaming=True).with_format("np")
print(next(iter(ds))["image"])

Right now it returns a PIL.Image.

With streaming=False, the same code does return a numpy array since #5072.
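To make the requested behavior concrete, here is a minimal sketch of per-example formatting applied lazily at iteration time. It is not the `datasets` implementation: `FormattedIterable` and `EXAMPLES` are hypothetical names, and nested lists stand in for decoded images so that the snippet needs only numpy.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

import numpy as np


class FormattedIterable:
    """Hypothetical wrapper that formats each example only when it is iterated."""

    def __init__(
        self,
        examples: Iterable[Dict[str, Any]],
        formatter: Callable[[Any], Any],
    ) -> None:
        self.examples = examples
        self.formatter = formatter

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        for example in self.examples:
            # Convert every field on the fly; nothing is materialized up front.
            yield {key: self.formatter(value) for key, value in example.items()}


# Stand-in for a streamed dataset (assumption: nested lists play the role of images).
EXAMPLES = [
    {"image": [[0, 1], [2, 3]], "label": 0},
    {"image": [[4, 5], [6, 7]], "label": 1},
]

ds = FormattedIterable(EXAMPLES, np.asarray)
first = next(iter(ds))
print(type(first["image"]), first["image"].shape)
```

The point of the sketch is that the conversion happens inside `__iter__`, so a streaming dataset can honor a format without loading anything ahead of time.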

lhoestq added the enhancement and streaming labels Oct 6, 2022
lhoestq self-assigned this Oct 6, 2022
lhoestq added the good second issue label Feb 17, 2023

zutarich commented Oct 9, 2023

Hi @lhoestq, can you assign this issue to me? I am new to open source, but I would love to put my best foot forward. I can see that nobody is assigned to this issue right now.

Member Author

lhoestq commented Oct 9, 2023

Hi @zutarich! This issue was fixed by #5852, sorry I forgot to close it.

Feel free to look for other issues and ping me or @mariosasko if you have questions :)
Also let us know if we can help you find an issue that matches what you're looking for.

lhoestq closed this as completed Oct 9, 2023