Replace os.path.sep.join path manipulations with a helper #2446

Merged 2 commits on Feb 26, 2024

akx
Contributor

@akx akx commented Feb 14, 2024

What does this PR do?

os.path.sep.join itself is a bit of an antipattern (os.path.join is the function generally meant for joining path components), but it was really used in N repeated places to get a path relative to the package's root.

This PR cleans those up to a simple helper (that returns a modern pathlib.Path to boot).
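
As a rough sketch of the pattern described above (not the PR's actual code): the helper name `path_in_package`, the `PACKAGE_ROOT` value, and the example path components are all made up for illustration. In a real package the root would more likely be derived from the package's own `__file__`.

```python
import os
from pathlib import Path

# Illustrative root only (an assumption); a real package would typically use
# something like Path(__file__).resolve().parent instead.
PACKAGE_ROOT = Path("my_package")


def old_style(*components: str) -> str:
    # The pattern being replaced: gluing components together with the
    # separator by hand, which yields a plain string.
    return os.path.sep.join([str(PACKAGE_ROOT), *components])


def path_in_package(*components: str) -> Path:
    # Hypothetical helper: returns a pathlib.Path under the package root.
    return PACKAGE_ROOT.joinpath(*components)


# Made-up path components, purely for demonstration.
script = path_in_package("test_utils", "scripts", "test_script.py")
legacy = old_style("test_utils", "scripts", "test_script.py")
```

Both calls name the same location; the difference is that `script` is a `Path` object while `legacy` is a `str`, so callers that expect strings need an explicit `str(script)`.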

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
    • No user-facing changes.
  • Did you write any new necessary tests?
    • The new helper is exercised by tests itself.

Collaborator

@muellerzr muellerzr left a comment

The reason we have this (and don't use pathlib) is iirc pathlib has issues on windows, no? As a result we use os here as it's device/os agnostic.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@akx
Contributor Author

akx commented Feb 15, 2024

> The reason we have this (and don't use pathlib) is iirc pathlib has issues on windows, no?

To the best of my knowledge, pathlib is absolutely fine on any OS supported by Python.

Besides, accelerate already uses pathlib both in tests and core code anyway.
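
A small sketch of the cross-platform point, assuming nothing beyond the standard library: the pure path classes let you see how the same components render under each OS's conventions, regardless of which OS runs the code. The component names are made up.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Made-up path components for illustration.
parts = ("pkg", "test_utils", "script.py")

posix = PurePosixPath(*parts)
windows = PureWindowsPath(*parts)

print(posix)               # pkg/test_utils/script.py
print(windows)             # pkg\test_utils\script.py
print(windows.as_posix())  # pkg/test_utils/script.py
```

A plain `pathlib.Path` picks the matching flavour for the current platform automatically.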

@akx
Contributor Author

akx commented Feb 15, 2024

cc @BenjaminBossan as well, I guess?

@BenjaminBossan
Member

AFAIK, there are no issues with pathlib.Path and Windows. Maybe there was something that has been fixed in recent years?

Anyway, I like the solution. The question is probably whether we want to rely more on pathlib.Path or stick with good old paths as strings, which is a bit more fiddly but familiar to more contributors.

Collaborator

@muellerzr muellerzr left a comment

@BenjaminBossan personally I'm a much bigger fan of pathlib. I'm just letting your big PRs sit a day or two while I mull them over, as they are quite large changes 😅

@akx
Contributor Author

akx commented Feb 15, 2024

Well, as said, the repository is already mixing and matching pathlib.Paths and os.path everywhere.

There's a Ruff lint family PTH that will suggest pathlib versions of os.path things.
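
To give a feel for the kind of rewrite such lints point toward, here is an illustrative comparison of a few `os.path` calls with their `pathlib` counterparts. The file name is made up, and this is not a list of what the lint family actually reports.

```python
import os.path
from pathlib import Path

# Made-up relative path for illustration.
raw = os.path.join("configs", "default.yaml")
p = Path("configs") / "default.yaml"

# Each os.path call on the left has a pathlib attribute or method on the right.
same_name = os.path.basename(raw) == p.name
same_suffix = os.path.splitext(raw)[1] == p.suffix
same_parent = os.path.dirname(raw) == str(p.parent)
same_exists = os.path.exists(raw) == p.exists()
```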

@BenjaminBossan
Member

> Well, as said, the repository is already mixing and matching pathlib.Paths and os.path everywhere.

Yes, for sure :) If Zach is in favor of pathlib.Path then I see no reason not to advance with this approach.

@muellerzr muellerzr merged commit 065d887 into huggingface:main Feb 26, 2024
23 checks passed
@muellerzr
Collaborator

This looks to have broken quite a few tests. Looking into it now.

FAILED tests/test_metrics.py::MetricTester::test_metric_accelerator_multi - TypeError: sequence item 2: expected str instance, PosixPath found
FAILED tests/test_multigpu.py::MultiDeviceTester::test_distributed_data_loop - TypeError: sequence item 2: expected str instance, PosixPath found
FAILED tests/test_multigpu.py::MultiDeviceTester::test_multi_device - TypeError: sequence item 2: expected str instance, PosixPath found
FAILED tests/test_multigpu.py::MultiDeviceTester::test_multi_device_ops - TypeError: sequence item 2: expected str instance, PosixPath found
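
The failing call sites are not shown in this thread, so the following is only one plausible way to hit this message: `str.join` rejects any non-`str` item, so a `Path` sitting at index 2 of a list being joined raises exactly this kind of `TypeError`. The command and path below are made up for illustration.

```python
from pathlib import Path

# Made-up command; the third item (index 2) is a Path, not a str.
script = Path("scripts") / "test_script.py"
cmd = ["torchrun", "--nproc_per_node=2", script]

message = ""
try:
    " ".join(cmd)
except TypeError as exc:
    # On POSIX the class named at the end is PosixPath; on Windows, WindowsPath.
    message = str(exc)
    print(message)

# One possible fix: convert every item to str before joining.
fixed = " ".join(str(part) for part in cmd)
print(fixed)
```

Under that assumption, any caller that used to receive a string from `os.path.sep.join` and now receives a `Path` needs an explicit `str(...)` (or `os.fspath(...)`) before the value goes into a string join.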
