bpo-34722: Consistent serialization of sets in bytecode #9472

Closed
peterebden wants to merge 3 commits

Conversation

peterebden
@peterebden peterebden commented Sep 21, 2018

This ensures that sets / frozensets marshal in a consistent order by sorting the serialised items before writing them. That will obviously make serialisation of such types somewhat slower.
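
As a rough Python-level illustration of the idea described above (the patch itself is C code, and the helper name `sorted_element_payloads` is hypothetical), each element can be marshalled on its own and the resulting byte strings sorted, so the output order no longer depends on set iteration order:

```python
import marshal


def sorted_element_payloads(items):
    """Marshal each element separately and return the payloads in sorted order.

    Hypothetical helper for illustration only; it is not part of the
    marshal module and does not reproduce the on-disk set format.
    """
    return sorted(marshal.dumps(item) for item in items)


# The same elements visited in two different orders.
first = [3, 1, 2]
second = [2, 3, 1]

# Sorting the serialised bytes removes the dependence on visiting order.
print(sorted_element_payloads(first) == sorted_element_payloads(second))
```

The extra cost mentioned above comes from the per-element serialisation plus the sort, which is why this is slower than writing elements in iteration order.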

Added a test for the case in question; it needs to use compileall as a subprocess in order to test with different hash seeds.
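
A minimal sketch of why a subprocess is needed, assuming only the standard library: the hash seed is fixed at interpreter start-up, so each seed has to be tried in a fresh process. This sketch calls `marshal.dumps` directly rather than `compileall`, and the names `CHILD` and `dump_with_seed` are made up for the example.

```python
import marshal
import os
import subprocess
import sys

# Child program: marshal a frozenset of strings and write the raw bytes to stdout.
CHILD = (
    "import marshal, sys;"
    "sys.stdout.buffer.write(marshal.dumps(frozenset({'a', 'b', 'c', 'd'})))"
)


def dump_with_seed(seed):
    """Run CHILD in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD], env=env, capture_output=True, check=True
    )
    return result.stdout


payloads = {seed: dump_with_seed(seed) for seed in (1, 2, 3)}

# The decoded values are always equal; only the byte layout may vary.
decoded = {marshal.loads(data) for data in payloads.values()}
print(len(decoded), "distinct value(s),", len(set(payloads.values())), "distinct payload(s)")
```

On an interpreter that writes set elements in iteration order, the payload count may exceed one; on an interpreter that already orders set elements when marshalling, the payloads should be identical.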

https://bugs.python.org/issue34722

Py_DECREF(value);
for (i = 0; i < n; i++) {
value = PyList_GetItem(l, i);
value = PyMarshal_WriteObjectToString(value, Py_MARSHAL_VERSION);
Member

This will lose the identity of objects. For example, in {(o, 1), (o, 2)} you will get different objects after marshalling/unmarshalling.
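
One possible reading of this concern, sketched in Python under stated assumptions: within a single dump, marshal can record a shared object once and refer back to it, whereas serialising each element with its own call produces independent payloads. The example uses a `bytes` object so that string interning does not mask the effect, and a list in place of the set so the element order is fixed; both choices are assumptions made for the illustration.

```python
import marshal

o = b"shared-payload"      # bytes, so string interning does not hide the effect
items = [(o, 1), (o, 2)]   # a list stands in for the set to keep the order fixed

# One dump of the whole container: the shared object can be written once
# and referenced the second time it is encountered.
whole = marshal.loads(marshal.dumps(items))
print(whole[0][0] is whole[1][0])

# One dump per element, similar in spirit to calling
# PyMarshal_WriteObjectToString() on each item: the payloads are
# independent, so each tuple comes back with its own copy of `o`.
separate = [marshal.loads(marshal.dumps(item)) for item in items]
print(separate[0][0] is separate[1][0])
```

On CPython with the default marshal format version, the first check is expected to be true and the second false, which is the sharing that a per-element serialisation would give up.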

Author

I'm not sure I can get the same objects back at present?

    def test_object_identity(self):
        o = 'test'
        obj = {(o, 1), (o, 2)}
        data = marshal.dumps(obj)
        v = marshal.loads(data)
        ids_before = {id(x) for x in obj}
        ids_after = {id(x) for x in marshal.loads(data)}
        self.assertEqual(ids_before, ids_after)

AssertionError: Items in the first set but not the second:
139864671036808
139864645838728

Maybe I'm misunderstanding what you mean, or that's not a good test case?

Member

Additional tests were added in #13736. Rebase your PR and make sure the tests pass.

w_object(value, p);
Py_DECREF(value);
for (i = 0; i < n; i++) {
value = PyList_GetItem(l, i);
Contributor

I think the PyList_GetItem() and PyList_SetItem() calls added in this patch should be replaced with the PyList_GET_ITEM() and PyList_SET_ITEM() macros.

Also, PyMarshal_WriteObjectToString() should be checked for failure.

@tiran tiran removed their request for review April 17, 2021 21:03
@methane
Member

methane commented May 4, 2022

Closing this because #27926 was merged.

@methane methane closed this May 4, 2022
6 participants