expanduser in save_to_disk #5651

Closed
RmZeta2718 opened this issue Mar 20, 2023 · 5 comments · Fixed by #6098

Comments

@RmZeta2718

RmZeta2718 commented Mar 20, 2023

Describe the bug

save_to_disk() does not expand ~

  1. dataset = load_dataset("any dataset")
  2. dataset.save_to_disk("~/data")
  3. A folder literally named "~" is created in the current working directory.
  4. FileNotFoundError is raised, because the expanded path (/home/<user>/data) does not exist.
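
To illustrate step 3, here is a minimal standard-library sketch of why a literal "~" folder appears. It does not call datasets at all; it only shows that Python path functions treat "~" as an ordinary directory name unless os.path.expanduser is applied. The variable names are illustrative.

```python
import os

path = "~/data"

# Without expansion, "~" is just a directory name, resolved relative
# to the current working directory.
literal = os.path.abspath(path)

# os.path.expanduser replaces the leading "~" with the user's home directory.
expanded = os.path.expanduser(path)

print("unexpanded:", literal)
print("expanded:  ", expanded)
```

The first path ends in a directory named "~" under the current folder, which matches the stray folder described above.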

Related issue: huggingface/transformers#10628

Steps to reproduce the bug

As described above.

Expected behavior

save_to_disk() should expand ~ to the user's home directory, as os.path.expanduser does.
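
Until this is handled inside the library, one possible workaround is to expand the path yourself before saving. The sketch below is not the library's fix; save_with_expanded_path is a hypothetical helper name, and it only assumes that the object passed in has a save_to_disk(path) method.

```python
import os


def save_with_expanded_path(dataset, path):
    """Hypothetical helper: expand "~" before delegating to save_to_disk."""
    expanded = os.path.expanduser(path)
    dataset.save_to_disk(expanded)
    # Return the path that was actually used, so callers can reload from it.
    return expanded
```

Usage would look like save_with_expanded_path(dataset, "~/data"), which passes /home/<user>/data to save_to_disk on a typical Linux setup.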

Environment info

  • datasets 2.10.1
  • python 3.10
@mariosasko
Collaborator

save_to_disk should indeed expand ~. Marking it as a "good first issue".

@mariosasko mariosasko added the good first issue Good for newcomers label Mar 24, 2023
@benjaminbrown038

benjaminbrown038 commented Mar 25, 2023

#self-assign

File path to code:

https://github.com/huggingface/datasets/blob/2.13.0/src/datasets/arrow_dataset.py#L1364

@RmZeta2718 I created a pull request for this issue.

@ashikshafi08

Hello,
It says save_to_disk is deprecated in 2.8.0, so the alternative to this will be storage_options?

https://huggingface.co/docs/datasets/package_reference/main_classes#datasets.Dataset.save_to_disk

@Unknown3141592
Contributor

@ashikshafi08 I think you misunderstood the warning. The method save_to_disk is not deprecated; only its optional parameter fs is.
Also, @benjaminbrown038, since I cannot find your PR, I would like to work on this if you don't mind.

@RmZeta2718
Author

@mariosasko It's been several months and the PR has not been reviewed. Could you please take a look? I assume this is not complicated and could be merged fairly soon.
