Disallow pickle.load unless TRUST_REMOTE_CODE=True #27776
@@ -288,33 +256,6 @@ def test_custom_hf_index_retriever_save_and_from_pretrained_from_disk(self):
        out = retriever.retrieve(hidden_states, n_docs=1)
        self.assertTrue(out is not None)

    def test_legacy_index_retriever_retrieve(self):
        n_docs = 1
I think we no longer need to test the "legacy" case after #27748 (not merged yet).
Let's use TRUST_REMOTE_CODE as the variable name. It will extend to the other use case and pave the way for a feature that has been requested a lot.
Not really.
Happy to give it another name if we come up with a better one though!
Thanks for the implementation @ydshieh! An environment variable for remote code has also been asked in the past, and given the proximity to this I think it would make sense to have the same variable name for the two.
If you're strongly against Would that be ok for you?
I am fine with the name, but I think we might adjust the documentation a bit for
And this sounds like I will update the PR.
Updated. It turns out that I don't need to update the message - it's clear enough 😅
Thanks 🫡
Thanks! I agree that using the same env var is a good idea and makes sense 👍
Sorry about that, but there is a tiny issue with the condition Thank you for all the review.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
This seems to have now triggered a dependabot update in downstream repos: https://github.com/huggingface/api-inference-community/security/dependabot/78 The concerned userbase is extremely limited and all we did was protect from pickle loading so really unsure about the "critical" report, but good to have done it nonetheless. Thanks Yih-Dar!
I should scan dependabot in all the dependent GitHub repositories to show my impact 😆
…7776)
* fix
* fix
* Use TRUST_REMOTE_CODE
* fix doc
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
What does this PR do?
For security reasons, require users to explicitly allow access to the places where pickle.load is used.
Before merge
There are a few pickle.load calls in examples/research_projects/, some conversion files and test files. These are considered less severe, but we can still apply the change to all of them to get rid of this unlovely pickle stuff.