-
Notifications
You must be signed in to change notification settings - Fork 581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Address failing _check_disk_space() when path doesn't exist yet #1692
Address failing _check_disk_space() when path doesn't exist yet #1692
Conversation
The documentation is not available anymore as the PR was closed or merged. |
Hi @martinbrose thanks for looking into this! I think the problem is not with the confusion between absolute or relative paths. In theory Here is what I think we should do:
This would give an implementation looking like this: target_dir = Path(target_dir) # format as `Path`
for path in [target_dir] + list(target_dir.parents): # first check target_dir, then each parents one by one
try:
target_dir_free = shutil.disk_usage(path).free
if target_dir_free < expected_size:
warnings.warn(
"Not enough free disk space to download the file. "
f"The expected file size is: {expected_size / 1e6:.2f} MB. "
f"The target location {target_dir} only has {target_dir_free / 1e6:.2f} MB free disk space."
)
return
except OSError: # raise on anything: file does not exist or space disk cannot be checked
pass # never raise. It's best to check parent folder instead Corresponding tests could be extended (1 for missing folder, 1 for relative path). Do you want to take care of it @martinbrose ? Thanks in advance! |
HI @Wauplin, thanks for your reply. You seem to be right... just double checked myself. I'll have a look tonight and update the PR tonight. |
Hi @Wauplin, your code recommendation makes sense... implemented and tested it. I also added the two additional test cases. While I'm kind of confident with the test case for the non-existent path, I'm not sure I understand exactly the ask for a test case for relative paths. Maybe I addressed it, maybe not. Please let me know. |
tests/test_file_download.py
Outdated
def test_disk_usage_warning_with_non_existent_path(self) -> None: | ||
# Test for not existent path | ||
with warnings.catch_warnings(record=True) as w: | ||
# Cause all warnings to always be triggered. | ||
warnings.simplefilter("always") | ||
_check_disk_space(expected_size=self.expected_size, target_dir="not_existent_path") | ||
assert len(w) == 0 | ||
|
||
# Test for relative path | ||
with warnings.catch_warnings(record=True) as w: | ||
# Cause all warnings to always be triggered. | ||
warnings.simplefilter("always") | ||
_check_disk_space(expected_size=self.expected_size, target_dir="./not_existent_path") | ||
assert len(w) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's exactly how I was seeing it except that 1 path must be relative (let's say "path/to/not_existent_path"
) and the other one absolute ("/path/to/not_existent_path"
) to check that both of them doesn't raise an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_disk_usage_warning_with_non_existent_path(self) -> None: | |
# Test for not existent path | |
with warnings.catch_warnings(record=True) as w: | |
# Cause all warnings to always be triggered. | |
warnings.simplefilter("always") | |
_check_disk_space(expected_size=self.expected_size, target_dir="not_existent_path") | |
assert len(w) == 0 | |
# Test for relative path | |
with warnings.catch_warnings(record=True) as w: | |
# Cause all warnings to always be triggered. | |
warnings.simplefilter("always") | |
_check_disk_space(expected_size=self.expected_size, target_dir="./not_existent_path") | |
assert len(w) == 0 | |
def test_disk_usage_warning_with_non_existent_path(self) -> None: | |
# Test for not existent (absolute) path | |
with warnings.catch_warnings(record=True) as w: | |
# Cause all warnings to always be triggered. | |
warnings.simplefilter("always") | |
_check_disk_space(expected_size=self.expected_size, target_dir="path/to/not_existent_path") | |
assert len(w) == 0 | |
# Test for not existent (relative) path | |
with warnings.catch_warnings(record=True) as w: | |
# Cause all warnings to always be triggered. | |
warnings.simplefilter("always") | |
_check_disk_space(expected_size=self.expected_size, target_dir="/path/to/not_existent_path") | |
assert len(w) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making the changes @martinbrose! Test looks good. I've updated it to have both an absolute and a relative paths but that's it. We are good to merge now :)
Codecov ReportAll modified lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1692 +/- ##
==========================================
- Coverage 82.41% 81.96% -0.45%
==========================================
Files 62 62
Lines 7194 7198 +4
==========================================
- Hits 5929 5900 -29
- Misses 1265 1298 +33
☔ View full report in Codecov by Sentry. |
* Add realpath to disk_usage to allow path-like obj * Add logic to address non-existent paths * Add test cased for non-existent paths * Update tests/test_file_download.py --------- Co-authored-by: Lucain <lucainp@gmail.com>
Address issue #1690.
_check_disk_space
fails when only provided with incomplete paths likecheckpoints/stabilityai/stable-diffusion-xl-base-1.0
and then tries to asses exactly this "path".It seems to be a regular case when
snapshot_download
and hencehf_hub_download
is being called without the parameterlocal_dir
.snapshot_download.py
(248:250) only performsos.path.realpath()
whenlocal_dir
is not None:My change in this PR now addresses this case and ensures that the real complete path is being used in
shutil.disk_usage().free
within_check_disk_space()
.