Use extended path on Windows when downloading to local dir #2378

Merged
merged 18 commits into from
Jul 11, 2024

Conversation

mlinke-ai
Contributor

Change the path of the local dir to an extended path by prepending "\\?\" to the absolute path when the absolute path is longer than 255 characters on Windows.

This addresses Issue #2374.

Also fixed a small typo.

Change the path of the local dir to an extended path by prepending
"\\?\" to the absolute path, when the absolute path is longer
than 255 characters on Windows.

Also fixed a small typo.
Contributor

@Wauplin Wauplin left a comment

Thanks for opening this PR @mlinke-ai! I left a comment to make sure the implementation is as robust as possible. Let me know what you think :)

# Some Windows versions do not allow for paths longer than 255 characters.
# In this case, we must specify it as an extended path by using the "\\?\" prefix.
if os.name == "nt" and len(os.path.abspath(local_dir)) > 255:
    local_dir = "\\\\?\\" + os.path.abspath(local_dir)
local_dir = Path(local_dir)
paths = get_local_download_paths(local_dir=local_dir, filename=filename)
Contributor

There might be cases where os.path.abspath(local_dir) is shorter than 255 characters but the lock file path or the file path is not. To be safer, I would move this logic into get_local_download_paths itself. If lock_path is >= 255 characters, then you should prefix all paths with \\?\ (if not already the case). In practice lock_path is "guaranteed" to be the longest path, so if this one is OK, all of them are.
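To make the concern concrete, here is a minimal sketch of a directory that is under the limit while a lock file nested inside it is not. The directory name, the nested folder names, and the file name are all hypothetical and chosen only to show the length gap; `ntpath` is used so the snippet applies Windows path rules on any OS.

```python
import ntpath  # Windows path rules, importable on any OS

MAX_PATH_LEN = 255  # threshold used in this PR

# Hypothetical directory and file names, chosen only to show the length gap.
local_dir = "C:\\" + "d" * 220
lock_path = ntpath.join(
    local_dir, ".cache", "huggingface", "download", "model.safetensors.lock"
)

# Checking only local_dir misses the problem; checking lock_path catches it.
local_dir_too_long = len(local_dir) > MAX_PATH_LEN
lock_path_too_long = len(lock_path) > MAX_PATH_LEN
print(len(local_dir), local_dir_too_long)
print(len(lock_path), lock_path_too_long)
```

A length check on `local_dir` alone would leave this download unprefixed even though the lock file cannot be created.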

Contributor Author

I agree, having the path handling in one location is a much better approach. I have pushed some changes following your recommendation.

Maybe we also need some tests for get_local_download_paths()?

mlinke-ai and others added 4 commits July 10, 2024 17:58
Change the path of the local dir to an extended path by prepending
"\\?\" to the absolute path, when the absolute path is longer
than 255 characters on Windows.

Also fixed a small typo.
On Windows we check the length of `lock_path` and if it is longer than
255 characters we prepend the `\\?\` prefix to all paths if it does not
already exist.

We only need to check the length of `lock_path` because it is guaranteed
to be the longest path.
…uggingface_hub into download-using-extended-path
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@Wauplin Wauplin left a comment

Thanks for the changes @mlinke-ai! Adding tests would be awesome. You can do that in ./tests/test_local_folder.py. The test should differentiate the expected output depending on whether it runs on Windows or not.
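A platform-aware test along these lines could look roughly like the sketch below. It is not the test that was added in this PR: `maybe_extend` is a hypothetical stand-in for the logic under test, and the class and method names are made up for illustration. It uses only the standard-library `unittest` module.

```python
import os
import unittest

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\


def maybe_extend(path: str, limit: int = 255) -> str:
    """Hypothetical stand-in for the path logic under test."""
    if os.name == "nt" and len(path) > limit and not path.startswith(EXTENDED_PREFIX):
        return EXTENDED_PREFIX + path
    return path


class TestLongPath(unittest.TestCase):
    def test_expected_output_depends_on_platform(self):
        long_path = "C:\\" + "a" * 300
        result = maybe_extend(long_path)
        if os.name == "nt":
            # On Windows the long path must come back with the prefix.
            self.assertTrue(result.startswith(EXTENDED_PREFIX))
        else:
            # Elsewhere the path must be left untouched.
            self.assertEqual(result, long_path)

    @unittest.skipUnless(os.name == "nt", "extended paths only apply on Windows")
    def test_prefix_not_added_twice(self):
        already_extended = EXTENDED_PREFIX + "C:\\" + "a" * 300
        self.assertEqual(maybe_extend(already_extended), already_extended)
```

The first test branches on the platform inside the test body; the second is reported as skipped on non-Windows machines instead of passing trivially.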

Comment on lines +1423 to +1426
# Some Windows versions do not allow for paths longer than 255 characters.
# In this case, we must specify it as an extended path by using the "\\?\" prefix.
if os.name == "nt" and len(os.path.abspath(local_dir)) > 255:
    local_dir = "\\\\?\\" + os.path.abspath(local_dir)
Contributor

This can now be removed! :)

Member

indeed!

# Some Windows versions do not allow for paths longer than 255 characters.
# In this case, we must specify it as an extended path by using the "\\?\" prefix
if os.name == "nt":
    if len(os.path.abspath(lock_path)) > 255:
Contributor

Suggested change
if len(os.path.abspath(lock_path)) > 255:
if len(os.path.abspath(lock_path)) > 255 and not str(local_dir).startswith("\\\\?\\"):

Let's check the length of lock_path and the prefix of local_dir only once. If the condition above is true, then all 3 paths must be changed. The logic is the same but we avoid the 3 xxx_path = xxx_path if ... else Path(...) below.

Contributor Author

What if we change the order of the tests:

if os.name == "nt" and not str(local_dir).startswith("\\\\?\\"):
    if len(os.path.abspath(lock_path)) > 255:
        file_path = Path("\\\\?\\" + os.path.abspath(file_path))
        lock_path = Path("\\\\?\\" + os.path.abspath(lock_path))
        metadata_path = Path("\\\\?\\" + os.path.abspath(metadata_path))

In this case we only check the length of the path if the user has not already specified the \\?\ prefix, since the length only matters when the prefix is missing.

Contributor

@Wauplin Wauplin Jul 11, 2024

Both are correct, so feel free to use whichever order you find more appropriate :) Much cleaner like this in any case!

(EDIT: slight preference to either have a single if statement or have two if statements with the first one being if os.name == "nt" alone)
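For illustration, the single-condition variant discussed in this thread might be sketched as follows. This is not the merged code: `extend_paths` and its `is_windows` parameter are hypothetical, introduced so the logic can be exercised on any OS instead of reading `os.name`, and the paths are plain strings handled with `ntpath` rather than `Path` objects.

```python
import ntpath  # Windows path rules, importable on any OS
from typing import Tuple

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\


def extend_paths(
    local_dir: str,
    file_path: str,
    metadata_path: str,
    lock_path: str,
    is_windows: bool,
    limit: int = 255,
) -> Tuple[str, str, str]:
    """Prefix all three paths together, or none of them (hypothetical helper)."""
    if (
        is_windows
        and not local_dir.startswith(EXTENDED_PREFIX)
        and len(lock_path) > limit
    ):
        # Assumes the inputs are already absolute; normpath only tidies separators.
        new_file = EXTENDED_PREFIX + ntpath.normpath(file_path)
        new_metadata = EXTENDED_PREFIX + ntpath.normpath(metadata_path)
        new_lock = EXTENDED_PREFIX + ntpath.normpath(lock_path)
        return new_file, new_metadata, new_lock
    return file_path, metadata_path, lock_path


# Usage with made-up paths: the lock path exceeds the limit, so all three change.
base = "C:\\" + "d" * 260
file_path, metadata_path, lock_path = extend_paths(
    base,
    base + "\\f.bin",
    base + "\\f.bin.metadata",
    base + "\\f.bin.lock",
    is_windows=True,
)
```

Because the decision is made once from `lock_path` and `local_dir`, the three paths can never end up in a mixed state where only some carry the prefix.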

qgallouedec and others added 2 commits July 11, 2024 13:07
* Fix token=False not respected in file download

* lint

Wauplin and others added 8 commits July 11, 2024 16:08
…el` (#2373)

* Handle shared layers in save_torch_state_dict + save_torch_model + some helpers

* fix pytest rerun

* more reruns
…et/Space) (#2333)

* First draft to support `expand` parameter for models

* add expand support for dataset

* add expand support for Space
Change the path of the local dir to an extended path by prepending
"\\?\" to the absolute path, when the absolute path is longer
than 255 characters on Windows.

Also fixed a small typo.
On Windows we check the length of `lock_path` and if it is longer than
255 characters we prepend the `\\?\` prefix to all paths if it does not
already exist.

We only need to check the length of `lock_path` because it is guaranteed
to be the longest path.
Change the path of the local dir to an extended path by prepending
"\\?\" to the absolute path, when the absolute path is longer
than 255 characters on Windows.

Also fixed a small typo.
…uggingface_hub into download-using-extended-path
mlinke-ai and others added 2 commits July 11, 2024 18:17
The test now shows up as `skipped` if executed on a non-Windows machine

Co-authored-by: Lucain <lucainp@gmail.com>
Contributor

@Wauplin Wauplin left a comment

Thanks for iterating on it @mlinke-ai! Looks good to me now :) Let's wait for the CI to pass and it should be good to merge!

EDIT: the code quality check seems to fail. Could you run pip install -e ".[dev]" locally and then run make style to fix the issues? To check that everything is fixed, you can run make quality. Then commit the changes and you're good!

@mlinke-ai
Contributor Author

No problem.

Please excuse the back and forth; this is actually my first PR.

@Wauplin
Contributor

Wauplin commented Jul 11, 2024

No problem at all! Going back and forth is the essence of open-source collaboration :) Glad to know your first PR is on huggingface_hub's repo 🤗

@Wauplin
Contributor

Wauplin commented Jul 11, 2024

Failing tests are unrelated to this PR so it's good to merge! 🚀

@Wauplin Wauplin merged commit 7fd7fcf into huggingface:main Jul 11, 2024
11 of 14 checks passed
5 participants