Secure persistence: add extensibility #141

BenjaminBossan · 2022-09-16T16:09:09Z

A mechanism should be implemented that allows library authors to add secure persistence to their library, and for users to opt those libraries in.

adrinjalali · 2022-09-27T06:50:24Z

Thinking more about this, what do you think of supporting only __getstate__/__setstate__, and allowing certain types as we do now, like lists, dicts, function, etc.

This would mean we don't need to do much on our side, and they only need to implement a sensible __getstate__/__setstate__ pair.

If we manage to move all sklearn models to the same path, then we can completely remove __reduce__.

BenjaminBossan · 2022-09-27T07:50:21Z

That sounds good, we can start with __getstate__ & __setstate__ and hopefully that's good enough. If it isn't, we can later think about providing more. If we can eventually stop supporting __reduce__, it should really lower our burden as well.

Still, the user should have to opt in explicitly. Do you already have an API in mind?

adrinjalali · 2022-09-27T09:59:58Z

I'm not sure what you mean by user in this context. I'm thinking the save would automatically work if the object of third party libraries provide these methods, and during load, we'd fail if we don't know the library, and user can say they trust the source kinda thing.

I'm still not sure if we should load simple objects if the user has the corresponding library installed or not, if they have it installed, they trust it?

BenjaminBossan · 2022-09-27T11:41:22Z

and during load, we'd fail if we don't know the library, and user can say they trust the source kinda thing.

That's what I meant. How would the API for this look like?

I'm still not sure if we should load simple objects if the user has the corresponding library installed or not, if they have it installed, they trust it?

Previously, I would have said that if you installed a malicious library, you're probably already compromised, so there is nothing to be gained from not trusting it at this point. But IIUC, Python is moving in a direction of not allowing arbitrary code to be run during installation (avoiding setup.py), so just installing a malicious package wouldn't mean you're compromised. But for the time being, there is of course still a lot of code that relies on setup.py that needs to be supported and I'm not sure if that's planned to be deprecated. (This is all from memory, maybe I get some details wrong.)

adrinjalali · 2022-09-29T13:12:18Z

Packages can run arbitrary code through .pth files. We don't need setup.py to run arbitrary code :D

discussions about removing .pth files: https://bugs.python.org/issue33944
rejected PEP about an alternative mechanism: https://peps.python.org/pep-0648/

While saving models, we save everything, it's only while loading that security issues come in play. Therefore I think simply relying on __getstate__/__setstate__ is enough, and there doesn't need to be an extra user facing API for it.

BenjaminBossan · 2022-09-29T13:49:57Z

Packages can run arbitrary code through .pth files. We don't need setup.py to run arbitrary code :D

Oh man, it never stops, does it? Okay, so we can basically assume that if a malicious package is installed, the user is already compromised.

While saving models, we save everything, it's only while loading that security issues come in play. Therefore I think simply relying on __getstate__/__setstate__ is enough, and there doesn't need to be an extra user facing API for it.

Okay, let's do it this way and hopefully it'll be sufficient.

BenjaminBossan added the persistence Secure persistence feature label Sep 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Secure persistence: add extensibility #141

Secure persistence: add extensibility #141

BenjaminBossan commented Sep 16, 2022

adrinjalali commented Sep 27, 2022

BenjaminBossan commented Sep 27, 2022 •

edited

Loading

adrinjalali commented Sep 27, 2022

BenjaminBossan commented Sep 27, 2022

adrinjalali commented Sep 29, 2022

BenjaminBossan commented Sep 29, 2022

Secure persistence: add extensibility #141

Secure persistence: add extensibility #141

Comments

BenjaminBossan commented Sep 16, 2022

adrinjalali commented Sep 27, 2022

BenjaminBossan commented Sep 27, 2022 • edited Loading

adrinjalali commented Sep 27, 2022

BenjaminBossan commented Sep 27, 2022

adrinjalali commented Sep 29, 2022

BenjaminBossan commented Sep 29, 2022

BenjaminBossan commented Sep 27, 2022 •

edited

Loading