Porting datasets to new TorchData API #229

Closed
3 of 17 tasks
fabrizio-ottati opened this issue Dec 27, 2022 · 2 comments

Labels
enhancement New feature or request

Comments

@fabrizio-ottati
Collaborator

See #201.

List of datasets to be ported:

  • NMNIST.
  • STMNIST.
  • NCARS.
  • ASLDVS.
  • CIFAR10-DVS.
  • DVSGesture.
  • NCALTECH101.
  • POKERDVS.
  • SMNIST.
  • DVSLip.
  • SHD.
  • SSC.
  • DAVISDATA.
  • DSEC.
  • MVSEC.
  • TUMVIE.
  • VPR.
@biphasic added the enhancement label on Dec 27, 2022
@biphasic
Member

In the end, I think we don't need to convert most of the datasets. For most classification datasets, one can just wrap the existing dataset in an IterableWrapper, like so:

import tonic
from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(tonic.datasets.NMNIST('data'))
dp = dp.filter(lambda data: data[1] == 0)  # keep only samples with label 0

I think the new API might make the most sense for DSEC and TUMVIE, but even for those, DataLoader2 is not yet mature enough to justify a migration. And if the new API does turn out to be much more suitable for them, IterableWrappers can be used there as well.
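
A minimal sketch of the same wrapping pattern that runs without downloading any data. `ToyDataset` and `has_label_zero` are hypothetical names made up for illustration; the only assumption about the real datasets is that indexing returns an `(events, label)` tuple, as `data[1]` above implies. Since `torchdata.datapipes` is not available in every torchdata release, the sketch falls back to a simplified stand-in class that mimics only iteration and `.filter()`:

```python
# Hypothetical stand-in for a map-style dataset such as tonic.datasets.NMNIST.
# Assumption: indexing returns an (events, label) tuple.
class ToyDataset:
    def __init__(self, labels):
        self.labels = list(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        # List indexing raises IndexError past the end, which ends iteration.
        label = self.labels[index]
        events = [index, index, index]  # placeholder for a real event array
        return events, label


try:
    from torchdata.datapipes.iter import IterableWrapper
except ImportError:
    # Simplified fallback, NOT the torchdata implementation: it supports
    # only iteration and .filter(), which is all this sketch needs.
    class IterableWrapper:
        def __init__(self, iterable, filter_fn=None):
            self.iterable = iterable
            self.filter_fn = filter_fn

        def __iter__(self):
            for item in self.iterable:
                if self.filter_fn is None or self.filter_fn(item):
                    yield item

        def filter(self, filter_fn):
            # Chain lazily so the result can be iterated more than once.
            return IterableWrapper(self, filter_fn)


def has_label_zero(sample):
    # sample is an (events, label) tuple
    return sample[1] == 0


dataset = ToyDataset([0, 1, 0, 2, 0])
dp = IterableWrapper(dataset).filter(has_label_zero)
selected = list(dp)
print([label for _, label in selected])
```

A named function is used instead of a lambda only to keep the predicate easy to reuse and test; the lambda form from the snippet above expresses the same filter.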

@biphasic
Member

Closing this for now because torchdata development is on hold; see pytorch/data#1196.
