Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: loading locally downloaded dataset #2056

Merged
merged 1 commit into from
Nov 16, 2024
Merged

Conversation

NanoCode012
Copy link
Collaborator

Description

Fix #1830

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

@winglian
Copy link
Collaborator

If the previous test loading the dataset with snapshot download already worked, then didn't this already work? If not, the new test doesn't really seem to validate the new handling.

@NanoCode012
Copy link
Collaborator Author

If the previous test loading the dataset with snapshot download already worked, then didn't this already work? If not, the new test doesn't really seem to validate the new handling.

Which test do you mean in particular? This test_load_local_hub_with_revision?

If so, it's a bit different and also the reason I got confused with the original issue. The author wants to pass in the path to a local folder (without data_files) like so "path": str(tmp_ds_path),.

The prior test does

                            "path": "mhenrichsen/alpaca_2k_test",
                            "data_files": [
                                f"{tmp_ds_path}/alpaca_2000.parquet",
                            ],

They would enter separate conditions. Another way to test is that, if you revert the data loading changes, the new test would fail while rest passes.

@winglian
Copy link
Collaborator

thanks for the clarification 👍

@winglian winglian merged commit fd70eec into main Nov 16, 2024
12 checks passed
@winglian winglian deleted the fix/local_dataset_loading branch November 16, 2024 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Edge case for local dataset loading if its a folder
2 participants