bug/_download_nltk_packages_if_not_present throws HTTP 403 Forbidden #3795
Comments
Also experiencing this issue ! |
I still have the issue now. Where can I download and install it? |
Hi @tn-halfspace, were you able to resolve this issue? |
This is ridiculous, @Unstructured-DevOps. It happened again today. I've created my own workaround that uses nltk's downloader directly, instead of relying on the unstructured.nlp downloader, to download the two packages that unstructured.io uses. (I think in a perfect world the best practice would be to ship your code with these two packages and install them manually by changing the NLTK_PATH according to the NLTK documentation.) |
If you upgrade to the latest, it should be fixed: #3796 |
The following could be a hack for solving this issue:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger') |
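A slightly more defensive variant of the workaround above is sketched here, assuming the two packages named in this thread are the only ones needed. It downloads a package only when NLTK cannot already find it. The helper names (`ensure_packages`, `ensure_with_nltk`) and the `PACKAGES` mapping are hypothetical and not part of unstructured or NLTK; only `nltk.data.find`, `nltk.data.path` and `nltk.download` are real NLTK calls.

```python
from typing import Callable, Dict, List, Optional

# Package name -> resource path used for lookup.
# Assumption: these are the two packages mentioned in this thread.
PACKAGES: Dict[str, str] = {
    "punkt": "tokenizers/punkt",
    "averaged_perceptron_tagger": "taggers/averaged_perceptron_tagger",
}


def ensure_packages(
    packages: Dict[str, str],
    is_present: Callable[[str], bool],
    download: Callable[[str], bool],
) -> List[str]:
    """Download only the missing packages and return the names that were fetched."""
    fetched = []
    for name, resource in packages.items():
        if is_present(resource):
            continue
        if not download(name):
            raise RuntimeError(f"could not download NLTK package {name!r}")
        fetched.append(name)
    return fetched


def ensure_with_nltk(download_dir: Optional[str] = None) -> List[str]:
    """Wire the generic helper to NLTK's own downloader."""
    import nltk  # imported lazily so this module loads even without nltk installed

    if download_dir is not None and download_dir not in nltk.data.path:
        # Make NLTK search the custom directory as well.
        nltk.data.path.append(download_dir)

    def is_present(resource: str) -> bool:
        try:
            nltk.data.find(resource)
            return True
        except LookupError:
            return False

    def download(name: str) -> bool:
        return nltk.download(name, download_dir=download_dir, quiet=True)

    return ensure_packages(PACKAGES, is_present, download)
```

Calling `ensure_with_nltk()` once at startup, before anything from unstructured runs, should leave both packages available locally. Whether unstructured then skips its own download depends on the installed version.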
yeah, that's what I am doing right now (iterative/datachain#687). :) |
This started happening for us today as well. The previous version called this S3 URL to download the NLTK data: https://utic-public-cf.s3.amazonaws.com/nltk_data_3.8.2.tar.gz. However, it now just returns a 403. |
Same here, we're also seeing the 403 errors. |
I'm experiencing this intermittently as well. |
This is still happening and has broken all our pipelines |
Same here |
Updating to 0.16.11 seems to have addressed the issue for me. |
I reinstalled everything and updated again, and it's working now with the latest version so far. However, I have other issues now where the outputs have changed :) but that's for another day (and another issue). Thanks for all the suggestions in this thread! |
Can confirm upgrading to the latest version of unstructured (0.16.11) resolved the bug for me. |
Bump unstructured to pick up resolution of Unstructured-IO/unstructured#3795
Closing as resolved. If you're still having trouble feel free to reopen. |
Describe the bug
Until 25.11.2024 I hadn't seen any problems with this function. Since then, this URL:
https://utic-public-cf.s3.amazonaws.com/nltk_data_3.8.2.tar.gz
has returned 403.
Edit: it's back up, but the behavior seems inconsistent.
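To check whether the URL is currently reachable, a small probe like the one below may help. It is only a sketch: `probe_status` is a hypothetical helper, and the `opener` parameter exists purely so the function can be exercised without network access. It sends a HEAD request with the standard library and reports the HTTP status code, including error codes such as 403.

```python
import urllib.error
import urllib.request

NLTK_DATA_URL = "https://utic-public-cf.s3.amazonaws.com/nltk_data_3.8.2.tar.gz"


def probe_status(url: str, opener=urllib.request.urlopen, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is still the answer we want.
        return err.code


if __name__ == "__main__":
    print(probe_status(NLTK_DATA_URL))
```

Running it a few times in a row would show whether the 403 is intermittent, as reported above.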