distutils is not reproducible #78214
Follow-up of bpo-29708: OpenSUSE uses a downstream patch for distutils, distutils-reproducible-compile.patch, to fix https://bugzilla.opensuse.org/show_bug.cgi?id=1049186. I converted the patch into a PR: PR 8057.
Naoki INADA wrote: I think we should use a more deterministic way instead of refcnt. Maybe count all constants in the module before marshal, like we did when compiling functions for co_consts and co_names.
Serhiy Storchaka added: On the other hand, if the problem is with reference counters in marshal, we can change the marshal module instead.
Copy of https://bugzilla.opensuse.org/show_bug.cgi?id=1049186 first message: in python3-simplejson.rpm we get in python3-simplejson-test.rpm we get the opposite change and it seems to be related to filesystem ordering, since it built reproducibly
I agree that we should fix the underlying issue (marshal) rather than papering over it by sorting. In fact, we should have a test that compiles a bunch of pycs in random orders and checks whether they come out the same.
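A minimal sketch of such a test, assuming it is enough to compile in shuffled orders inside one process; a fuller test would probably start a fresh interpreter per run. The helper names `compile_in_order` and `unstable_modules` are hypothetical and not part of distutils or CPython. Hash-based pycs and a fixed `dfile` keep the source mtime and the temporary path out of the output, so any remaining difference should come from the marshalled code itself.

```python
import hashlib
import py_compile
import random
import tempfile
from pathlib import Path


def compile_in_order(sources, order, workdir):
    """Compile the named sources in the given order; return {name: sha256 of pyc}."""
    digests = {}
    for name in order:
        src = Path(workdir) / f"{name}.py"
        src.write_text(sources[name])
        pyc = py_compile.compile(
            str(src),
            cfile=str(src.with_suffix(".pyc")),
            # Embed a stable file name instead of the temporary path.
            dfile=f"{name}.py",
            doraise=True,
            # Hash-based pycs do not embed the source mtime in the header.
            invalidation_mode=py_compile.PycInvalidationMode.UNCHECKED_HASH,
        )
        digests[name] = hashlib.sha256(Path(pyc).read_bytes()).hexdigest()
    return digests


def unstable_modules(sources, rounds=5, seed=0):
    """Return the names whose pyc bytes differed between shuffled compile orders."""
    rng = random.Random(seed)
    names = sorted(sources)
    baseline = None
    unstable = set()
    for _ in range(rounds):
        order = names[:]
        rng.shuffle(order)
        with tempfile.TemporaryDirectory() as workdir:
            digests = compile_in_order(sources, order, workdir)
        if baseline is None:
            baseline = digests
        unstable.update(n for n in names if digests[n] != baseline[n])
    return sorted(unstable)
```

An empty list from `unstable_modules` only means that no difference showed up in these rounds; it does not prove that the output is reproducible.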
Is this issue only about the known marshal issue?
We should probably discuss the marshal issue in the preëxisting bpo-31377. I'm not sure if "distutils is not reproducible" is a larger issue than "pyc compilation is not reproducible". This issue could be a meta issue for either.
IMHO the order in which .pyc files are created on disk also matters. It changes the result of "os.listdir()", and some applications may rely on the unsorted os.listdir() order. sorted() seems simple and harmless compared to the benefit.
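As an illustration of the sorted() suggestion, here is a small sketch that lists source files in a stable order regardless of how the filesystem returns them. The function name `sorted_py_files` is hypothetical; it is not what distutils or the openSUSE patch actually uses.

```python
import os


def sorted_py_files(root):
    """Return paths of .py files under root in a stable, name-sorted order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place makes os.walk descend in a fixed order.
        dirnames.sort()
        for name in sorted(filenames):
            if name.endswith(".py"):
                found.append(os.path.join(dirpath, name))
    return found
```

Feeding this list to the byte-compilation step would make the order in which .pyc files are written independent of directory ordering.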
OK, I created a sub-issue for pyc.
Unreproducible .pyc files are still one of the major headaches for my work on openSUSE reproducible builds. There is also one aspect where i586 builds end up with different .pyc files than x86_64 builds. We then randomly choose one of them for our "noarch" python module packages and hope they work everywhere (including on arm and s390 architectures). So is someone working towards a concept that makes it possible to create the same .pyc files anywhere?
They are functionally identical, despite not being bit-by-bit identical.
No, it's a known issue no one is working on.
Maybe?
A solution will probably come with an unacceptable performance hit -- it's good to keep generating the .pyc files fast. Two options to overcome that come to mind:
Use FLAG_REF always for interned strings. Refcounts of interned strings are very unstable: when compiling the same source, the refcount of an interned string in the output may be 1 or >1, which makes FLAG_REF usage unstable. To help reproducible builds, use FLAG_REF for interned strings even if refcnt(obj)==1.
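To see the flag in question, the sketch below inspects the first byte of a marshal dump. It assumes the encoding used by CPython's Python/marshal.c, where FLAG_REF is the high bit (0x80) of the type byte; the helpers `split_type_byte` and `describe` are hypothetical names for illustration only.

```python
import marshal

# High bit of the type byte, as defined in CPython's Python/marshal.c.
FLAG_REF = 0x80


def split_type_byte(byte):
    """Split a marshal type byte into (type code character, FLAG_REF set?)."""
    return chr(byte & ~FLAG_REF & 0xFF), bool(byte & FLAG_REF)


def describe(obj, version=marshal.version):
    """Marshal obj and report the type code and FLAG_REF bit of its first byte."""
    data = marshal.dumps(obj, version)
    return split_type_byte(data[0])
```

Calling `describe` on the same string object in different states of the interpreter is one way to observe whether the flag depends on the reference count; the result may vary between Python versions.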
Thank you @methane -- we are now carrying your patch in python 3.10 in Debian: https://sources.debian.org/src/python3.10/3.10.4-4/debian/patches/gh-78214.diff/
@vstinner should this be closed now or will there be any other patches to Distutils before its removal? A
I don't know the status of this issue, you should ask @methane who is more involved in this topic.
Ahh sorry, I will wait for @methane's opinion. A
We have enough open issues related to reproducible pyc, so I agree to close this one.