Torchrl data omission #12

dwsmart32 · 2024-01-24T21:43:11Z

Hi, I saw that you recently uploaded the dataset to the torchrl repository. It is great that I can access it easily through torch tensordict. However, I am writing to report an issue I encountered while attempting to download the main->humanoid_walk->medium-replay dataset from the V-D4RL benchmarks through torchrl.

(I installed both torchrl and tensordict directly from their GitHub repositories, since the pip releases do not seem to have been updated yet.)

(…)03b080197e0b44c08694cb699fff5ce6-501.npz: 100%|██████████████████████████████████| 2.30M/2.30M [00:20<00:00, 112kB/s]
file=/tmp/tmpdnm5926d/datasets--conglu--vd4rl/snapshots/6001dd3a96d44c22e2a6c5c8f937ba0f840c4d50/vd4rl/main/humanoid_wal
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[1], line 11
      9         for pixel in [64, 84]:
     10             print(f'task, type, pixel:, {task}, {type}, {pixel}')
---> 11             d = VD4RLExperienceReplay(f"main/{task}/{type}/{pixel}px", batch_size=4, image_size=50, download='force')
     12 for batch in d:   
     13     print(batch)

File ~/anaconda3/envs/vd4rl/lib/python3.9/site-packages/torchrl/data/datasets/vd4rl.py:200, in VD4RLExperienceReplay.__init__(self, dataset_id, batch_size, root, download, sampler, writer, collate_fn, pin_memory, prefetch, transform, split_trajs, totensor, image_size, **env_kwargs)
    198         except FileNotFoundError:
    199             pass
--> 200     storage = self._download_and_preproc(dataset_id, data_path=self.data_path)
    201 elif self.split_trajs and not os.path.exists(self.data_path):
    202     storage = self._make_split()

File ~/anaconda3/envs/vd4rl/lib/python3.9/site-packages/torchrl/data/datasets/vd4rl.py:308, in VD4RLExperienceReplay._download_and_preproc(cls, dataset_id, data_path)
    306             td_save = tdc[0]
    307         tds.append(td)
--> 308         total_steps += td.shape[0]
    310 # From this point, the local paths are non needed anymore
    311 td_save = td_save.expand(total_steps).memmap_like(data_path, num_threads=32)

IndexError: tuple index out of range

This issue prevents me from downloading the humanoid-medium-replay dataset, and it is the only dataset that fails. I've verified that my setup and versions are compatible as per the documentation, yet the problem persists. I suspect that some files (maybe .npz files) have been omitted for some reason, on the Hugging Face Hub or somewhere else.
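For reference, the message `tuple index out of range` on `td.shape[0]` is what Python raises when an empty tuple is indexed, so it may point to an episode that was loaded without a leading (batch) dimension rather than to a file that is entirely missing. The sketch below only illustrates that reading with plain tuples, assuming `td.shape` behaves like a Python tuple. The helpers `leading_dim` and `find_batchless` and the episode names are hypothetical and are not part of torchrl.

```python
def leading_dim(shape):
    """Return shape[0], or None when the shape tuple is empty (no batch dimension)."""
    return shape[0] if len(shape) > 0 else None


def find_batchless(shapes_by_name):
    """Return the names whose shape has no leading dimension to index."""
    return [name for name, shape in shapes_by_name.items() if leading_dim(shape) is None]


# Indexing an empty tuple reproduces the message seen in the traceback.
try:
    ()[0]
except IndexError as err:
    message = str(err)

# Hypothetical episode names and shapes, for illustration only.
example_shapes = {"episode_a": (501,), "episode_b": ()}
suspects = find_batchless(example_shapes)

print(message)
print(suspects)
```

A check of this kind, applied to the shapes of the per-episode tensordicts before they are summed, might help narrow down which downloaded file triggers the error.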

Could you please look into this matter? Any guidance on resolving this error or confirming whether this might be a known issue with a potential workaround would be highly appreciated.

Thank you very much for your time and assistance.


conglu1997 · 2024-01-25T00:56:30Z

Hi, the torchrl integration was very kindly contributed by @vmoens in pytorch/rl#1756. Would you be happy to raise an issue in that repository?

Thanks so much!

conglu1997 closed this as completed Mar 4, 2024
