Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-96735: Fix undefined behaviour in struct unpacking functions #96739

Merged

Conversation

mdickinson
Copy link
Member

@mdickinson mdickinson commented Sep 10, 2022

This PR fixes undefined behaviour in the struct module unpacking support functions bu_longlong, lu_longlong, bu_int and lu_int; thanks to @kumaraditya303 for finding these.

The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)

To make the conversions efficient, I've specialised the relevant functions for their output size: for bu_longlong and lu_longlong, this only entails checking that the output size is indeed 8. But bu_int and lu_int were used for format sizes 2 and 4 - I've split those into two separate functions each.

No tests, because all of the affected cases are already exercised by the test suite.

@mdickinson

This comment was marked as outdated.

@mdickinson mdickinson changed the title gh-96735: Fix undefined behaviour in bu_ulonglong gh-96735: Fix undefined behaviour in struct unpacking functions Sep 10, 2022
@mdickinson
Copy link
Member Author

It's very tempting to convert all of these functions to use fixed-width integer types (uint16, uint32, uint64), but that seems like too invasive a change for a bugfix PR. It's something we could possibly do for 3.12 only.

Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I tested this and indeed it fixes the UB.

Thanks for fixing the other ones too, just noticed that those were also affected.

@kumaraditya303
Copy link
Contributor

It's very tempting to convert all of these functions to use fixed-width integer types (uint16, uint32, uint64), but that seems like too invasive a change for a bugfix PR. It's something we could possibly do for 3.12 only.

I agree, it would be better but best left for 3.12 only.

@mdickinson mdickinson added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 13, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @mdickinson for commit 1294440 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 13, 2022
@mdickinson
Copy link
Member Author

@kumaraditya303 I've updated this PR to be more efficient, while still avoiding undefined behaviour and implementation-defined behaviour (specifically, the conversion from an unsigned type to the corresponding signed type when the value being converted is not representable in the signed type; cf. C99 §6.3.1.3p3).

The sign-extension is now branch free (and should compile to a no-op in the case that the C type being used has exactly the correct width), and the conversion from unsigned to signed should always compile to a no-op on any semi-reasonable compiler.

Godbolt example for the case where sign extension is needed: https://godbolt.org/z/e8roKrn6r
Godbolt example for the case where no sign extension is needed: https://godbolt.org/z/bPhbsT9Kn

@matthiasgoergens
Copy link
Contributor

matthiasgoergens commented Sep 14, 2022

We are currently compiling all non-pydebug builds with -fwrapv (at least on gcc and clang). That means the shenanigans going on in _struct.c are actually perfectly defined behaviour.

I don't think we will turn off -fwrapv for 3.11. So there's no need for a stopgap solution.

I'd say: just do the Right Thing for 3.12 in main, and don't worry about 3.11. Especially if the Right Thing is very tempting to do anyway.

See also #96821 and #96678 (comment)

As a minimal change for 3.11, I would suggest enabling -fwrapv even when giving --with-pydebug.

That being said, I'm not opposed to this PR. It's a great start to getting everything safe for -fstrict-overflow.

@mdickinson
Copy link
Member Author

I'd say: just do the Right Thing for 3.12 in main, and don't worry about 3.11. Especially if the Right Thing is very tempting to do anyway.

Agreed; I think I'll merge this for 3.12 only.

@mdickinson mdickinson removed the needs backport to 3.10 only security fixes label Sep 25, 2022
@mdickinson mdickinson removed the needs backport to 3.11 only security fixes label Sep 25, 2022
@mdickinson
Copy link
Member Author

Some not very rigorous timings, on macOS/Intel, non-optimised non-debug build. (This PR is not primarily about performance, but it would be unfortunate if it caused a significant performance regression.)

On this branch:

mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"
5000000 loops, best of 5: 92.6 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"
5000000 loops, best of 5: 94.3 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"
5000000 loops, best of 5: 99.1 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"
5000000 loops, best of 5: 98.3 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"
2000000 loops, best of 5: 99.1 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"
2000000 loops, best of 5: 102 nsec per loop

On the main branch at commit 6281aff:

mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"  
5000000 loops, best of 5: 94.9 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"  
5000000 loops, best of 5: 98.2 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"
2000000 loops, best of 5: 99.1 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"
2000000 loops, best of 5: 104 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"
2000000 loops, best of 5: 101 nsec per loop
mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"
2000000 loops, best of 5: 112 nsec per loop

@mdickinson mdickinson merged commit f5f047a into python:main Sep 25, 2022
@mdickinson mdickinson deleted the fix-struct-unpack-undefined-behaviour branch September 25, 2022 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants