python variables called _m lead to unreproducible pyc installations #92132

Closed
josch opened this issue May 2, 2022 · 4 comments

Comments

josch commented May 2, 2022

Hi,

In Debian we backported the changes from #27926 to CPython 3.10, and most pyc files that we generate when installing Python packages are now bit-for-bit reproducible. A single pyc file still sometimes differs after installation: /usr/lib/python3.10/json/__pycache__/decoder.cpython-310.pyc. I tracked the cause down to its use of the variable name _m and filed this as Debian bug 1010368. More specifically, that pyc file has one content in 1/3 of the cases and another in the other 2/3. The diff between the two pyc files is:

@@ -1,8 +1,8 @@
 00000000: 6f0d 0d0a 0300 0000 5371 fe33 17b6 dd59  o.......Sq.3...Y
 00000010: e300 0000 0000 0000 0000 0000 0000 0000  ................
 00000020: 0001 0000 0040 0000 0073 0800 0000 6500  .....@...s....e.
-00000030: 0100 6400 5300 2901 4e29 01da 025f 6da9  ..d.S.).N)..._m.
-00000040: 0072 0200 0000 7202 0000 00fa 0f2f 746d  .r....r....../tm
+00000030: 0100 6400 5300 2901 4e29 015a 025f 6da9  ..d.S.).N).Z._m.
+00000040: 0072 0100 0000 7201 0000 00fa 0f2f 746d  .r....r....../tm
 00000050: 702f 6465 636f 6465 722e 7079 da08 3c6d  p/decoder.py..<m
 00000060: 6f64 756c 653e 0100 0000 7302 0000 0008  odule>....s.....
 00000070: 00

I can make this problem trigger on a different variable name than _m via the following patch:

--- a/Lib/types.py
+++ b/Lib/types.py
@@ -37,8 +37,8 @@ _ag = _ag()
 AsyncGeneratorType = type(_ag)
 
 class _C:
-    def _m(self): pass
-MethodType = type(_C()._m)
+    def _b(self): pass
+MethodType = type(_C()._b)
 
 BuiltinFunctionType = type(len)
 BuiltinMethodType = type([].append)     # Same as BuiltinFunctionType

With that patch, the pyc files of Python files containing the variable name _b are now the ones that are sometimes unreproducible.

I don't seem to be the first to stumble across the _m variable: #78903 (comment)

In the Debian bug I referenced above, Chris Lamb states that there is no semantic difference between the two pyc files, and Keith Amling points out that the difference is whether or not FLAG_REF is set.
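
To make the FLAG_REF observation concrete, here is a minimal sketch that splits a marshal type byte into its type code and the FLAG_REF bit (0x80 in Python/marshal.c). It first decodes the two variants of the byte at offset 0x3b in the hex dump above, then looks at the byte the running interpreter writes in front of a name. The helper names `split_type_byte` and `name_type_byte` are made up for this illustration, and the one-line module `_m` is only an assumption about what a minimal test file could look like. Whether the flag is set in the second part depends on the interpreter version and on reference counts, so no output is claimed for it.

```python
import marshal

FLAG_REF = 0x80  # high bit of a marshal type byte, as defined in Python/marshal.c


def split_type_byte(byte):
    """Split a marshal type byte into (type code, whether FLAG_REF is set)."""
    return chr(byte & 0x7F), bool(byte & FLAG_REF)


# The two variants of the byte at offset 0x3b in the hex dump above.
# 'Z' is the type code for a short, interned ASCII string.
print(split_type_byte(0xDA))  # ('Z', True)
print(split_type_byte(0x5A))  # ('Z', False)


def name_type_byte(source, name):
    """Return the split type byte written in front of `name` (hypothetical helper)."""
    data = marshal.dumps(compile(source, "/tmp/decoder.py", "exec"))
    # A short ASCII string is written as: type byte, length byte, characters.
    marker = bytes([len(name)]) + name.encode("ascii")
    return split_type_byte(data[data.index(marker) - 1])


# The result may differ between interpreter versions and installations.
print(name_type_byte("_m\n", "_m"))
```

When the flag is set, the string takes a slot in marshal's reference table, which would also explain why the later back-references in the dump change from index 1 (72 0100 0000) to index 2 (72 0200 0000).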

Since pyc files containing the variable _m are only sometimes unreproducible across new Python installations but stable within the same installation, this might just as well be a Python packaging bug in Debian, or maybe we need to backport more than just #27926 to 3.10 to make pyc files stable across different installations. I thus wanted to ask you for any input or advice you can give on this issue.
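
For anyone who wants to check this on their own installation, the following is a rough sketch of one way to do it, not the procedure used in the Debian bug. It compiles a source string to a hash-based pyc (so no mtime is embedded), pins the embedded file name with `dfile`, and reports a digest plus the first differing offset between two byte strings. The function names, the one-line source `_m` and the path `/tmp/decoder.py` are assumptions chosen to resemble the hex dump above.

```python
import hashlib
import pathlib
import py_compile
import tempfile


def compile_to_pyc_bytes(source, embedded_name="/tmp/decoder.py"):
    """Compile `source` to a checked hash-based pyc and return the file's bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "module.py"
        pyc = pathlib.Path(tmp) / "module.pyc"
        src.write_text(source)
        py_compile.compile(
            str(src),
            cfile=str(pyc),
            dfile=embedded_name,  # file name recorded inside the code object
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
        )
        return pyc.read_bytes()


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if a == b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other, or they are identical.
    return None if len(a) == len(b) else min(len(a), len(b))


pyc_bytes = compile_to_pyc_bytes("_m\n")
print(hashlib.sha256(pyc_bytes).hexdigest())
```

Because the result is stable within one installation, the digest only becomes interesting when it is collected on several fresh installations and compared; `first_difference` can then point at the offset where two collected files diverge.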

Thanks!

josch added the type-bug label May 2, 2022

methane (Member) commented May 2, 2022

See #8226 and #8293

methane (Member) commented May 2, 2022

And this issue is a duplicate of #78214

josch (Author) commented May 2, 2022

Thanks! I backported the changes to Python/marshal.c from #8226 to Python 3.10 in Debian and can confirm that those changes fix this particular problem.

@methane do you think that #8226 has a chance to be merged soon?

methane (Member) commented May 4, 2022

I merged #8226 for Python 3.11.
However, #28379 is a more robust fix for the longer term.

Anyway, I'm closing this issue because it is a duplicate of #78274

methane closed this as completed May 4, 2022
methane removed the type-bug label May 4, 2022