torch-loader(example): use prefetch and try to run example in linux #691

Merged 1 commit on Dec 11, 2024
7 changes: 5 additions & 2 deletions examples/get_started/torch-loader.py
@@ -5,6 +5,7 @@

 """

+import multiprocessing
 import os
 from posixpath import basename

@@ -54,6 +55,7 @@ def forward(self, x):
 if __name__ == "__main__":
     ds = (
         DataChain.from_storage(STORAGE, type="image")
+        .settings(cache=True, prefetch=25)
         .filter(C("file.path").glob("*.jpg"))
         .map(
             label=lambda path: label_to_int(basename(path)[:3], CLASSES),
@@ -64,8 +66,9 @@ def forward(self, x):

     train_loader = DataLoader(
         ds.to_pytorch(transform=transform),
-        batch_size=16,
-        num_workers=2,
+        batch_size=25,
+        num_workers=4,
+        multiprocessing_context=multiprocessing.get_context("spawn"),
Review comment by the PR author (Member) on the line above:

fsspec's event loop is not fork-safe. Even if we create a new loop for each forked process, s3fs and other filesystem implementations may not be fork-safe either.

See https://s3fs.readthedocs.io/en/latest/#multiprocessing.

As a result, the future returned by run_coroutine_threadsafe hangs forever during prefetch.

I could contribute a fix for the first problem (the fsspec loop), but the second one (the filesystems themselves) would still remain.
Also, Python 3.14 changes the default start method on POSIX systems from 'fork' to 'forkserver' (macOS already uses 'spawn'). See python/cpython#84559.

So I think it's better to recommend a start method other than 'fork'.
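
As a rough illustration of that recommendation, the sketch below picks a start method that does not rely on plain fork, using only the standard library. The helper name `safe_context` and its preference order are assumptions made for this example; they are not part of this PR or of any library.

```python
import multiprocessing


def safe_context(preferred=("spawn", "forkserver")):
    """Return a multiprocessing context that avoids the plain 'fork' start method.

    `safe_context` is a hypothetical helper for illustration only.
    """
    available = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in available:
            # get_context() returns a context object bound to this start
            # method without changing the process-wide default.
            return multiprocessing.get_context(method)
    raise RuntimeError(f"none of {preferred!r} is available on this platform")


if __name__ == "__main__":
    ctx = safe_context()
    print(ctx.get_start_method())
```

The resulting context could then be passed as `multiprocessing_context=safe_context()` to `DataLoader`, which is equivalent to the `multiprocessing.get_context("spawn")` call in the diff whenever 'spawn' is available.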

     )

     model = CNN()
9 changes: 1 addition & 8 deletions tests/examples/test_examples.py
@@ -6,14 +6,7 @@

 import pytest

-get_started_examples = sorted(
-    [
-        filename
-        for filename in glob.glob("examples/get_started/**/*.py", recursive=True)
-        # torch-loader will not finish within an hour on Linux runner
-        if "torch" not in filename or os.environ.get("RUNNER_OS") != "Linux"
-    ]
-)
+get_started_examples = sorted(glob.glob("examples/get_started/**/*.py", recursive=True))

 llm_and_nlp_examples = sorted(glob.glob("examples/llm_and_nlp/**/*.py", recursive=True))
