Filename too long error #2756
Comments
Hi @adamkarvonen, sorry you encountered this issue.
>>> len("autointerp_with_generations/saebench_pythia-160m-deduped_width-2pow12_date-0108/saebench_pythia-160m-deduped_width-2pow12_date-0108_StandardTrainerAprilUpdate_EleutherAI_pythia-160m-deduped_ctx1024_0108_resid_post_layer_8_trainer_10_custom_sae_eval_results.json")
261
The filepath alone exceeds the 255 character limit on most systems, even before adding the necessary cache management components. The best way to fix this would be to reduce the filepath length on your side.
Thanks for the response. The file path itself isn't actually the issue: I can successfully download the file to this location using other download methods that don't create long temporary filenames. I get that the SHA256 hash is important for security, but going from a 20-character hash to a 64-character hash in this context probably only adds a tiny fraction of a percent in terms of collision resistance. I understand there are trade-offs here, but it might be worth considering adding an optional flag that lets users specify shorter temporary filenames when needed. While this is definitely a niche issue, it seems like there could be a simple fix that maintains security by default while giving users a way to work around these filename length problems.
Hi @adamkarvonen, I just opened #2789 to fix this. Note that this issue only happens because you are "downloading to a local folder". Downloading to the cache folder (the default behavior) should already work. I chose not to add a flag, as I expect it would go largely unused (and therefore would not be worth maintaining). All incomplete paths will not be based on
That looks like a great fix! Thanks for addressing this.
I got this error when using snapshot_download():
OSError: [Errno 36] File name too long: 'pythia-4k-backup/.cache/huggingface/download/autointerp_with_generations/saebench_pythia-160m-deduped_width-2pow12_date-0108/saebench_pythia-160m-deduped_width-2pow12_date-0108_StandardTrainerAprilUpdate_EleutherAI_pythia-160m-deduped_ctx1024_0108_resid_post_layer_8_trainer_10_custom_sae_eval_results.json.81ca464035a619f824d841360a423337f239b86c07092010f6df8e64fef98b74.incomplete'
My original filename is about 180 characters, and the filename limit on my OS is 255. huggingface-hub added ~80 characters to the intermediate filename (a 64-character hash plus the .incomplete extension), causing the "File name too long" error. I think huggingface-hub would be fine using only 20 or 30 characters for the hash and the .incomplete extension, which would reduce filename length errors.
Possibly some logic could be added so that the hash is shortened when the intermediate filename would exceed 255 characters.
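To make the suggestion concrete, here is a minimal sketch of what such logic might look like. It is not the huggingface-hub implementation and not the fix from #2789; `incomplete_name`, `MAX_NAME_BYTES`, and `MIN_HASH_CHARS` are hypothetical names chosen for illustration, and the 255-byte limit and 20-character floor are assumptions taken from the numbers in this issue.

```python
# Hypothetical sketch: NOT the huggingface-hub implementation.
MAX_NAME_BYTES = 255    # assumed per-filename limit, as reported in this issue
MIN_HASH_CHARS = 20     # assumed minimum hash length still considered acceptable
SUFFIX = ".incomplete"


def incomplete_name(filename: str, file_hash: str,
                    max_bytes: int = MAX_NAME_BYTES) -> str:
    """Build '<filename>.<hash>.incomplete', truncating the hash only if needed."""
    full = f"{filename}.{file_hash}{SUFFIX}"
    if len(full.encode("utf-8")) <= max_bytes:
        return full  # fits as-is: keep the full hash

    # Bytes left for the hash once the filename, the dot, and the suffix are counted.
    budget = max_bytes - len(f"{filename}.{SUFFIX}".encode("utf-8"))
    if budget < MIN_HASH_CHARS:
        raise ValueError("filename too long to fit even a shortened hash")

    # The hash is hexadecimal ASCII, so one character is one byte.
    return f"{filename}.{file_hash[:budget]}{SUFFIX}"
```

With a 181-character filename and a 64-character hash, the untruncated name would be 257 bytes, so this sketch keeps only the first 62 hash characters and the result is exactly 255 bytes long. Short filenames are unaffected and keep the full hash.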