Torchrl data omission #12

Closed
dwsmart32 opened this issue Jan 24, 2024 · 1 comment

dwsmart32 commented Jan 24, 2024

Hi, I saw that you recently uploaded the dataset to the torchrl repository. It is great to be able to access it so easily through tensordict. However, I am writing to report an issue I encountered while downloading the main -> humanoid_walk -> medium-replay dataset from the V-D4RL benchmark through torchrl.

(I installed both torchrl and tensordict directly from their GitHub repositories, since the pip releases do not seem to have been updated yet.)

(…)03b080197e0b44c08694cb699fff5ce6-501.npz: 100%|██████████████████████████████████| 2.30M/2.30M [00:20<00:00, 112kB/s]
file=/tmp/tmpdnm5926d/datasets--conglu--vd4rl/snapshots/6001dd3a96d44c22e2a6c5c8f937ba0f840c4d50/vd4rl/main/humanoid_wal
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[1], line 11
      9         for pixel in [64, 84]:
     10             print(f'task, type, pixel:, {task}, {type}, {pixel}')
---> 11             d = VD4RLExperienceReplay(f"main/{task}/{type}/{pixel}px", batch_size=4, image_size=50, download='force')
     12 for batch in d:   
     13     print(batch)

File ~/anaconda3/envs/vd4rl/lib/python3.9/site-packages/torchrl/data/datasets/vd4rl.py:200, in VD4RLExperienceReplay.__init__(self, dataset_id, batch_size, root, download, sampler, writer, collate_fn, pin_memory, prefetch, transform, split_trajs, totensor, image_size, **env_kwargs)
    198         except FileNotFoundError:
    199             pass
--> 200     storage = self._download_and_preproc(dataset_id, data_path=self.data_path)
    201 elif self.split_trajs and not os.path.exists(self.data_path):
    202     storage = self._make_split()

File ~/anaconda3/envs/vd4rl/lib/python3.9/site-packages/torchrl/data/datasets/vd4rl.py:308, in VD4RLExperienceReplay._download_and_preproc(cls, dataset_id, data_path)
    306             td_save = tdc[0]
    307         tds.append(td)
--> 308         total_steps += td.shape[0]
    310 # From this point, the local paths are non needed anymore
    311 td_save = td_save.expand(total_steps).memmap_like(data_path, num_threads=32)

IndexError: tuple index out of range
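
The last frame fails on `td.shape[0]`, which is the error one would expect if `td` had a shape with no dimensions at all. Below is a minimal sketch of that failure mode. It uses a plain Python tuple as a stand-in for the shape, which is an assumption: the real object inside torchrl was not inspected here, and `first_dim` is a hypothetical helper, not a torchrl function.

```python
# Stand-in for a shape with no batch dimensions (assumption: a plain tuple
# is used instead of the real torchrl/tensordict shape object).
empty_shape = ()


def first_dim(shape):
    """Return the leading dimension, or None if the shape has no dimensions.

    Hypothetical helper for illustration only.
    """
    try:
        return shape[0]
    except IndexError:
        return None


# Reproduce the message seen at the bottom of the traceback.
error_message = None
try:
    empty_shape[0]
except IndexError as exc:
    error_message = str(exc)

print(error_message)
print(first_dim((501, 3)))
print(first_dim(empty_shape))
```

If this is what happens, at least one loaded episode would have an empty shape rather than a leading step dimension, but that remains to be confirmed against the actual files.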

This error occurs only with the humanoid_walk medium-replay dataset, which I have been unable to download. I have verified that my setup and versions are compatible with the documentation, yet the problem persists. I suspect that some files (perhaps .npz files) are missing, either on the Hugging Face Hub or somewhere else.
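
One way to check the missing-file hypothesis locally is to scan the downloaded files with the standard library, since `.npz` files are zip archives of `.npy` arrays. This is only a sketch: `find_suspect_npz` and its `root` argument are illustrative names and not part of torchrl, and the check only detects unreadable or empty archives, not arrays with unexpected shapes.

```python
import zipfile
from pathlib import Path


def find_suspect_npz(root):
    """Return (path, reason) pairs for .npz files that look broken.

    Hypothetical helper, not part of torchrl: `root` is whatever local
    directory holds the downloaded episode files.
    """
    suspects = []
    for path in sorted(Path(root).rglob("*.npz")):
        # An .npz file must be a readable zip archive.
        if not zipfile.is_zipfile(path):
            suspects.append((path, "not a valid zip archive"))
            continue
        with zipfile.ZipFile(path) as archive:
            if not archive.namelist():
                suspects.append((path, "archive contains no arrays"))
            elif archive.testzip() is not None:
                # testzip() returns the name of the first corrupt member.
                suspects.append((path, "corrupt member in archive"))
    return suspects
```

Calling it on the local snapshot directory and getting an empty list would suggest the files themselves are readable, in which case the cause is more likely in how the episodes are loaded than in a missing upload.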

Could you please look into this matter? Any guidance on resolving this error or confirming whether this might be a known issue with a potential workaround would be highly appreciated.

Thank you very much for your time and assistance.

conglu1997 (Owner) commented Jan 25, 2024

Hi, the torchrl integration was very kindly contributed by @vmoens in pytorch/rl#1756. Would you be happy to raise an issue in that repository?

Thanks so much!
