gh-78214: marshal: Stabilize FLAG_REF usage #8226
I'm in the process of reviewing.
Would you mind also providing your initial PR? I want to try to optimize or simplify it.
Doesn't using FLAG_REF for all interned strings slow down unmarshalling and increase memory consumption for both marshalling and unmarshalling?
For unmarshalling speed, it's possible. But an interned string already carries the cost of interning: creating a temporary string and calling PyDict_SetDefault(). The overhead of FLAG_REF (a PyList_Append) is much smaller than that. As for memory overhead when marshalling, it will increase the hashtable size for each interned string with refcnt==1. I don't expect it to be a practical problem, though.
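To make the reader-side cost concrete, here is a minimal toy sketch in Python, not CPython's actual reader. It assumes only a small subset of the format (short ASCII strings, small tuples, back-references), and the name read_object is hypothetical. An object whose type byte carries FLAG_REF costs one list append, and a later back-reference is a single index lookup.

```python
import sys

FLAG_REF = 0x80  # high bit of the type byte in the marshal format


def read_object(buf, pos, refs):
    """Read one object from buf at pos and return (obj, new_pos).

    Toy subset only: 'z'/'Z' short ASCII strings, ')' small tuples,
    and 'r' back-references. Flagged tuples are not registered here.
    """
    byte = buf[pos]
    pos += 1
    flagged = bool(byte & FLAG_REF)
    code = chr(byte & 0x7F)

    if code == "r":
        # Back-reference: a 4-byte little-endian index into refs.
        index = int.from_bytes(buf[pos:pos + 4], "little")
        return refs[index], pos + 4

    if code in ("z", "Z"):
        size = buf[pos]
        text = buf[pos + 1:pos + 1 + size].decode("ascii")
        pos += 1 + size
        if code == "Z":
            text = sys.intern(text)  # the interning cost mentioned above
        if flagged:
            refs.append(text)  # the FLAG_REF cost: one list append
        return text, pos

    if code == ")":
        count = buf[pos]
        pos += 1
        items = []
        for _ in range(count):
            item, pos = read_object(buf, pos, refs)
            items.append(item)
        return tuple(items), pos

    raise ValueError(f"type code {code!r} is outside this toy subset")


# Hand-built input: a 2-tuple holding a flagged interned "{" and a
# back-reference to it.
data = bytes([ord(")"), 2, 0xDA, 1]) + b"{" + b"r" + (0).to_bytes(4, "little")
refs = []
obj, end = read_object(data, 0, refs)
```

The input is built by hand rather than produced by marshal.dumps(), because the exact bytes marshal emits depend on the interpreter version and on reference counts.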
@bmwiedemann Do you know about this one?
I might have seen this patch 2 years ago. Chances are, I didn't try it, because it did not apply cleanly to the python we had in Tumbleweed.
@bmwiedemann Yes, you are right, this is basically non-applicable (especially the monstrosity in these two conflicting @methane Do you think you would be willing and able to port this PR to something more recent (e.g., 3.8 or master), please?
importlib.h and importlib_external.h are generated files. You can apply 6c8ea7c cleanly and "make regen-importlib".
So, @bmwiedemann what do you think about https://build.opensuse.org/package/show/home:mcepl:branches:devel:languages:python:Factory/python38 ?
@mcepl I tried my
--- old /usr/lib64/python3.8/test/test_json/__pycache__/test_scanstring.cpython-38.pyc (hex)
+++ new /usr/lib64/python3.8/test/test_json/__pycache__/test_scanstring.cpython-38.pyc (hex)
@@ -1,6 +1,6 @@
000002c0 00 00 54 29 02 f5 06 00 00 00 7a f0 9d 84 a0 78 |..T)......z....x|
000002d0 e9 05 00 00 00 7a 08 22 5c 75 30 30 37 62 22 29 |.....z."\u007b")|
-000002e0 02 fa 01 7b e9 08 00 00 00 7a 3c 22 41 20 4a 53 |...{.....z<"A JS|
+000002e0 02 da 01 7b e9 08 00 00 00 7a 3c 22 41 20 4a 53 |...{.....z<"A JS|
000002f0 4f 4e 20 70 61 79 6c 6f 61 64 20 73 68 6f 75 6c |ON payload shoul|
00000300 64 20 62 65 20 61 6e 20 6f 62 6a 65 63 74 20 6f |d be an object o|
00000310 72 20 61 72 72 61 79 2c 20 6e 6f 74 20 61 20 73 |r array, not a s|
--- old /usr/lib64/python3.8/__pycache__/netrc.cpython-38.pyc (hex)
+++ new /usr/lib64/python3.8/__pycache__/netrc.cpython-38.pyc (hex)
@@ -1,7 +1,7 @@
000007c0 63 64 65 66 7a 02 20 09 da 01 0a 7a 04 20 09 0d |cdefz. ....z. ..|
000007d0 0a 7a 15 62 61 64 20 74 6f 70 6c 65 76 65 6c 20 |.z.bad toplevel |
000007e0 74 6f 6b 65 6e 20 25 72 3e 04 00 00 00 72 1e 00 |token %r>....r..|
-000007f0 00 00 72 21 00 00 00 72 20 00 00 00 72 22 00 00 |..r!...r ...r"..|
+000007f0 00 00 72 22 00 00 00 72 20 00 00 00 72 21 00 00 |..r"...r ...r!..|
00000800 00 7a 26 6d 61 6c 66 6f 72 6d 65 64 20 25 73 20 |.z&malformed %s |
00000810 65 6e 74 72 79 20 25 73 20 74 65 72 6d 69 6e 61 |entry %s termina|
00000820 74 65 64 20 62 79 20 25 73 da 05 6c 6f 67 69 6e |ted by %s..login|
--- old /usr/lib64/python3.8/__pycache__/pathlib.cpython-38.pyc (hex)
+++ new /usr/lib64/python3.8/__pycache__/pathlib.cpython-38.pyc (hex)
@@ -1,5 +1,5 @@
000079c0 64 20 27 2e 2e 27 2e 0a 20 20 20 20 20 20 20 20 |d '..'.. |
-000079d0 3e 02 00 00 00 72 32 00 00 00 72 a9 00 00 00 4e |>....r2...r....N|
+000079d0 3e 02 00 00 00 72 a9 00 00 00 72 32 00 00 00 4e |>....r....r2...N|
000079e0 29 05 72 64 01 00 00 72 69 01 00 00 72 b5 00 00 |).rd...ri...r...|
000079f0 00 72 d5 00 00 00 72 f7 00 00 00 72 44 01 00 00 |.r....r....rD...|
00007a00 72 22 00 00 00 72 22 00 00 00 72 23 00 00 00 da |r"...r"...r#....|
The last 2 could be a different issue of ordering.
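For readers following the hex dump, a type byte in a .pyc body can be split into a one-character type code and the FLAG_REF bit. The sketch below is only a reading aid: the helper name describe_type_byte is hypothetical, and the table lists just the codes that appear in the dump above, with names as I recall them from CPython's Python/marshal.c.

```python
FLAG_REF = 0x80  # high bit of the type byte

# Only the codes visible in the dump above; names recalled from
# Python/marshal.c and worth double-checking against the source.
TYPE_NAMES = {
    "r": "TYPE_REF",
    "z": "TYPE_SHORT_ASCII",
    "Z": "TYPE_SHORT_ASCII_INTERNED",
    "i": "TYPE_INT",
    "u": "TYPE_UNICODE",
    ">": "TYPE_FROZENSET",
    ")": "TYPE_SMALL_TUPLE",
}


def describe_type_byte(byte):
    """Return (type name, FLAG_REF set?) for a single type byte."""
    code = chr(byte & 0x7F)
    return TYPE_NAMES.get(code, "unknown"), bool(byte & FLAG_REF)


for b in (0xFA, 0xDA, 0xE9, 0x72, 0x3E):
    print(f"{b:#04x}", describe_type_byte(b))
```

Read this way, the first hunk (fa becoming da) appears to change the one-character string "{" from a short ASCII string to an interned one, with FLAG_REF set in both cases. The other two hunks swap TYPE_REF indices following a 0x3e (frozenset) byte, which would fit the remark that they are an ordering issue.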
Regardless, treating all interned strings as "ref" objects makes sense, so +1 from me.
LGTM
This is a temporary fix and will be reverted after a more complete fix has landed.
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>.
* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.
Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
…ation patches from: python/cpython#27926 python/cpython#8226 Both of these are included in python 3.11 upstream.
marshal.dumps() tests refcnt(obj)==1 to decide whether to use FLAG_REF. But the refcnt of an interned string is very unstable: when compiling the same source, the refcnt of an interned string in the output may be 1 or >1, which makes FLAG_REF usage unstable.
To help reproducible builds, use FLAG_REF for interned strings even if refcnt(obj)==1.
https://bugs.python.org/issue34033
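The rule described above can be modelled in a few lines of Python. This is a simplified sketch of the decision only, not the C code in marshal.c. The function name uses_flag_ref and its parameters are hypothetical, and the model ignores the hashtable bookkeeping that turns later occurrences of an object into back-references.

```python
def uses_flag_ref(refcnt, is_interned_str, patched):
    """Model whether the writer marks an object with FLAG_REF.

    refcnt: reference count of the object when it is written.
    is_interned_str: True if the object is an interned string.
    patched: True to model the behaviour proposed in this PR.
    """
    if refcnt == 1:
        # Old rule: nothing else references the object, so skip the flag.
        # New rule: still set the flag when it is an interned string,
        # so the output no longer depends on an unstable refcnt.
        return patched and is_interned_str
    return True


# Same interned string, two possible refcounts for the same source.
for patched in (False, True):
    outcomes = {uses_flag_ref(n, True, patched) for n in (1, 2)}
    print("patched" if patched else "before", "stable:", len(outcomes) == 1)
```

In this model the unpatched rule gives two different answers for the same interned string depending on its refcnt, while the patched rule gives one.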