Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-78274: Produce consistent output from marshal.dumps(). #28379

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrently ericsnowcurrently commented Sep 16, 2021

The result on debug vs. non-debug builds were slightly different. This PR fixes that by making an additional pass over the object to ensure "refs" are used more than once. Consequently, marshal.dumps() is substantially slower. To mitigate this, I've added to it a keyword-only arg, "stable", to allow folks to opt out.

FYI, I discovered gh-8293 after the fact, which turns out to have a lot of similarities to this PR.

@ericsnowcurrently
Copy link
Member Author

@gvanrossum, is something like this what you had in mind?

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, but I'm probably not going to be able to review it before Monday.

@ericsnowcurrently ericsnowcurrently removed the request for review from tiran September 16, 2021 22:06
@ericsnowcurrently ericsnowcurrently changed the title bpo-45186: Produce consistent output from marshal.dumps(). bpo-34093: Produce consistent output from marshal.dumps(). Sep 16, 2021
@@ -1789,6 +1837,8 @@ marshal.dumps
version: int(c_default="Py_MARSHAL_VERSION") = version
Indicates the data format that dumps should use.
/
*
stable: bool = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably make sense to default to False for now, so users would effectively opt-in to the performance penalty. (The same goes for marshal.dump().)

@ericsnowcurrently
Copy link
Member Author

ericsnowcurrently commented Sep 20, 2021

As I noted in [bpo-34093](https://bugs.python.org/issue34093#msg402244), we could take a different approach to get the same stable output without having to call w_object() twice. However, this PR would probably be worth landing in the meantime.

@ericsnowcurrently
Copy link
Member Author

This PR should probably also add a flag to determine if the import machinery would use marshal.dumps(code, stable=True). The same goes for the compileall module.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Oct 21, 2021
@gvanrossum
Copy link
Member

gvanrossum commented Oct 21, 2021 via email

@methane
Copy link
Member

methane commented Oct 21, 2021

We need to this to support "reproducible build" in downstream.

@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Oct 22, 2021
Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has merge conflicts now.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@gvanrossum gvanrossum changed the title bpo-34093: Produce consistent output from marshal.dumps(). gh-78274: Produce consistent output from marshal.dumps(). Nov 29, 2022
@encukou
Copy link
Member

encukou commented Mar 26, 2024

Closing; see discussion in the issue.

@encukou encukou closed this Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants