Loading a dataset from Google Colab hangs at "Resolving data files". #6552

Closed
KelSolaar opened this issue Jan 3, 2024 · 2 comments

Comments

@KelSolaar

Describe the bug

Hello,

I'm trying to load a dataset in Google Colab, but the process hangs at "Resolving data files":

[screenshot]

The hang occurs when the _get_origin_metadata function is called:

def _get_origin_metadata(
    data_files: List[str],
    max_workers=64,
    download_config: Optional[DownloadConfig] = None,
) -> Tuple[str]:
    return thread_map(
        partial(_get_single_origin_metadata, download_config=download_config),
        data_files,
        max_workers=max_workers,
        tqdm_class=hf_tqdm,
        desc="Resolving data files",
        disable=len(data_files) <= 16,
    )

The thread is then stuck at waiter.acquire() in the standard-library threading.py module.
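
For anyone trying to confirm where the threads are blocked, one possible approach is the standard-library faulthandler module, which can dump every thread's stack once a call has run past a timeout. This is only a sketch: the helper name run_with_hang_watchdog, the 60-second default, and the log file name are illustrative assumptions, not part of datasets.

```python
import faulthandler


def run_with_hang_watchdog(fn, timeout_s=60.0, log_path="hang_traceback.log"):
    """Call fn(); if it is still running after timeout_s seconds,
    write the stack of every thread to log_path (hypothetical helper)."""
    # A real file is used because faulthandler needs a file descriptor,
    # which notebook output streams do not always provide.
    with open(log_path, "w") as log:
        faulthandler.dump_traceback_later(timeout_s, exit=False, file=log)
        try:
            return fn()
        finally:
            # Cancel the pending dump if fn() returned before the timeout.
            faulthandler.cancel_dump_traceback_later()


# Example use in the Colab session from this report (assumes datasets is installed):
# from datasets import load_dataset
# run_with_hang_watchdog(
#     lambda: load_dataset("colour-science/color-checker-detection-dataset")
# )
```

If the call hangs, the log file should show which frames each worker thread is waiting in, which makes it easier to tell a lock wait from a slow network request.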

I can load the dataset just fine on my machine.

Cheers,

Thomas

Steps to reproduce the bug

In Google Colab:

!pip install datasets
from datasets import load_dataset

dataset = load_dataset("colour-science/color-checker-detection-dataset")

Expected behavior

The dataset should be loaded.

Environment info

  • datasets version: 2.16.1
  • Platform: Linux-6.1.58+-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • huggingface_hub version: 0.20.1
  • PyArrow version: 10.0.1
  • Pandas version: 1.5.3
  • fsspec version: 2023.6.0
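
The versions above can also be collected programmatically, which is handy for comparing the Colab environment against a local one where loading works. A minimal sketch using the standard-library importlib.metadata; the function name installed_versions is a hypothetical helper, and the package list simply mirrors the one above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["datasets", "huggingface_hub", "pyarrow", "pandas", "fsspec"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```
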
@lhoestq
Member

lhoestq commented Jan 5, 2024

This bug comes from the huggingface_hub library, see: huggingface/huggingface_hub#1952

A fix is provided at huggingface/huggingface_hub#1953. Feel free to install huggingface_hub from this PR, or wait for it to be merged and for the new version of huggingface_hub to be released.

@KelSolaar
Author

Thanks!
