Skip to content

Conversation

@methane
Copy link
Member

@methane methane commented Jul 10, 2018

marshal.dumps() tests refcnt(obj)==1 to decide use FLAG_REF or not.

But refcnt of interned string is very unstable.
When compiling same source, refcnt of interned string in the output
may be 1 or >1. It makes FLAG_REF usage unstable.

To help reproducible build, use FLAG_REF for interned string even if
refcnt(obj)==1.

https://bugs.python.org/issue34033

@methane methane added type-feature A feature request or enhancement awaiting core review DO-NOT-MERGE and removed awaiting merge labels Jul 10, 2018
@methane methane changed the title bpo-34033: marshal: Use FLAG_REF stably bpo-34093: marshal: Use FLAG_REF stably Jul 11, 2018
@serhiy-storchaka serhiy-storchaka self-requested a review July 12, 2018 05:45
@serhiy-storchaka
Copy link
Member

I'm in process of reviewing.

@methane methane changed the title bpo-34093: marshal: Use FLAG_REF stably bpo-34093: marshal: Stabilize FLAG_REF usage Jul 14, 2018
@serhiy-storchaka
Copy link
Member

Do you mind to provide also your initial PR? I want to try to optimize or simplify it.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't using FLAG_REF for all interned strings slows down unmarshalling and increases the memory consumption for both marshalling and unmarshalling?

@methane
Copy link
Member Author

methane commented Jul 16, 2018

Isn't using FLAG_REF for all interned strings slows down unmarshalling and increases the memory consumption for both marshalling and unmarshalling?

For unmarshaling speed, it's possible. But interned string has cost of interning; create temoporary string and calling PyDict_SetDefault(). Overhead of FLAG_REF (PyList_Append) is much smaller than it.

For marshaling memory overhead, it will increase hashtable size for each interned string with refcnt==1.
For unmarshaling memory overhead, it will increase list length too.

I don't expect it can be pragmatic problem though.

@mcepl
Copy link
Contributor

mcepl commented Jun 16, 2020

@bmwiedemann Do you know about this one?

@bmwiedemann
Copy link
Contributor

I might have seen this patch 2 years ago. Chances are, I didnt try it, because it did not apply cleanly to the python we had in Tumbleweed.

@mcepl
Copy link
Contributor

mcepl commented Aug 6, 2020

@bmwiedemann Yes, you are right, this is basically non-applicable (especially the monstrosity in these two conflicting importlib*.h files).

@methane Do you think, you would be willing and able to port this PR to something more recent (e.g., 3.8 or master), please?

@methane
Copy link
Member Author

methane commented Aug 7, 2020

importlib.h and importlib_external.h are generated files. You can apply 6c8ea7c cleanly and "make regen-importlib".

@mcepl
Copy link
Contributor

mcepl commented Aug 7, 2020

@bmwiedemann
Copy link
Contributor

@mcepl I tried my multibuildrbk script for a python38-base build there and got:

--- old /usr/lib64/python3.8/test/test_json/__pycache__/test_scanstring.cpython-38.pyc (hex)
+++ new /usr/lib64/python3.8/test/test_json/__pycache__/test_scanstring.cpython-38.pyc (hex)
@@ -1,6 +1,6 @@
 000002c0  00 00 54 29 02 f5 06 00  00 00 7a f0 9d 84 a0 78  |..T)......z....x|
 000002d0  e9 05 00 00 00 7a 08 22  5c 75 30 30 37 62 22 29  |.....z."\u007b")|
-000002e0  02 fa 01 7b e9 08 00 00  00 7a 3c 22 41 20 4a 53  |...{.....z<"A JS|
+000002e0  02 da 01 7b e9 08 00 00  00 7a 3c 22 41 20 4a 53  |...{.....z<"A JS|
 000002f0  4f 4e 20 70 61 79 6c 6f  61 64 20 73 68 6f 75 6c  |ON payload shoul|
 00000300  64 20 62 65 20 61 6e 20  6f 62 6a 65 63 74 20 6f  |d be an object o|
 00000310  72 20 61 72 72 61 79 2c  20 6e 6f 74 20 61 20 73  |r array, not a s|

--- old /usr/lib64/python3.8/__pycache__/netrc.cpython-38.pyc (hex)
+++ new /usr/lib64/python3.8/__pycache__/netrc.cpython-38.pyc (hex)
@@ -1,7 +1,7 @@
 000007c0  63 64 65 66 7a 02 20 09  da 01 0a 7a 04 20 09 0d  |cdefz. ....z. ..|
 000007d0  0a 7a 15 62 61 64 20 74  6f 70 6c 65 76 65 6c 20  |.z.bad toplevel |
 000007e0  74 6f 6b 65 6e 20 25 72  3e 04 00 00 00 72 1e 00  |token %r>....r..|
-000007f0  00 00 72 21 00 00 00 72  20 00 00 00 72 22 00 00  |..r!...r ...r"..|
+000007f0  00 00 72 22 00 00 00 72  20 00 00 00 72 21 00 00  |..r"...r ...r!..|
 00000800  00 7a 26 6d 61 6c 66 6f  72 6d 65 64 20 25 73 20  |.z&malformed %s |
 00000810  65 6e 74 72 79 20 25 73  20 74 65 72 6d 69 6e 61  |entry %s termina|
 00000820  74 65 64 20 62 79 20 25  73 da 05 6c 6f 67 69 6e  |ted by %s..login|

--- old /usr/lib64/python3.8/__pycache__/pathlib.cpython-38.pyc (hex)
+++ new /usr/lib64/python3.8/__pycache__/pathlib.cpython-38.pyc (hex)
@@ -1,5 +1,5 @@
 000079c0  64 20 27 2e 2e 27 2e 0a  20 20 20 20 20 20 20 20  |d '..'..        |
-000079d0  3e 02 00 00 00 72 32 00  00 00 72 a9 00 00 00 4e  |>....r2...r....N|
+000079d0  3e 02 00 00 00 72 a9 00  00 00 72 32 00 00 00 4e  |>....r....r2...N|
 000079e0  29 05 72 64 01 00 00 72  69 01 00 00 72 b5 00 00  |).rd...ri...r...|
 000079f0  00 72 d5 00 00 00 72 f7  00 00 00 72 44 01 00 00  |.r....r....rD...|
 00007a00  72 22 00 00 00 72 22 00  00 00 72 23 00 00 00 da  |r"...r"...r#....|

The last 2 could be a different issue of ordering.

@methane methane requested a review from tiran as a code owner May 3, 2022 09:15
@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented May 3, 2022

We started freezing a bunch of stdlib modules last year, which definitely impacts the story here.

Regardless, treating all interned strings as "ref" objects makes sense, so +1 from me.

Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@methane methane changed the title bpo-34093: marshal: Stabilize FLAG_REF usage gh-78214: marshal: Stabilize FLAG_REF usage May 4, 2022
@methane methane merged commit 6dcfd6c into python:main May 4, 2022
@methane methane deleted the stable-marshal branch May 4, 2022 01:01
@methane
Copy link
Member Author

methane commented May 4, 2022

This is temporal fix and will be reverted after more complete fix is landed.

Millak pushed a commit to Millak/guix that referenced this pull request Jan 22, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request May 7, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jun 9, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jun 9, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jun 11, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jun 17, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jun 27, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jun 29, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Jul 11, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
Z572 pushed a commit to Z572/guix that referenced this pull request Jul 22, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
Z572 pushed a commit to Z572/guix that referenced this pull request Jul 23, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Aug 10, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Aug 28, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actions bot pushed a commit to guix-ru/guix that referenced this pull request Aug 31, 2024
While Python build was reproducible on a single machine, once multiple file
systems entered the picture, it was no longer true.  The solution adopted by
the upstream (and Debian) was cherry-picked.

More info: <python/cpython#8226>.

* gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
SomberNight added a commit to SomberNight/python-for-android that referenced this pull request Oct 15, 2024
…ation

patches from:
python/cpython#27926
python/cpython#8226

Both of these are included in python 3.11 upstream.
f321x pushed a commit to f321x/spesmilo-python-for-android-fork-fork that referenced this pull request Aug 25, 2025
…ation

patches from:
python/cpython#27926
python/cpython#8226

Both of these are included in python 3.11 upstream.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type-feature A feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants