tpambor
Contributor

@tpambor tpambor commented Jun 2, 2025

The in_addr and in6_addr structures currently have a 4-byte alignment
requirement. However, throughout the network subsystem, these structures
are accessed using UNALIGNED_GET/UNALIGNED_PUT macros to safely handle
unaligned memory accesses. This is often necessary as IP addresses do not
always occur aligned in network packets.

This commit reduces the alignment requirement of in_addr and in6_addr
to 1 byte by marking them as packed, reflecting their actual usage.

The struct sockaddr has a minimum alignment requirement of 2 bytes.
It is commonly cast to more specific socket address types
such as sockaddr_in, sockaddr_ll, etc. To ensure safe casting and avoid
potential unaligned memory accesses, these derived structures must not
have a stricter alignment requirement than sockaddr.

This commit reduces the minimum alignment of sockaddr_ll from 4 bytes to
2 bytes, aligning it with sockaddr and preventing possible issues on
architectures that do not tolerate unaligned accesses.
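
The casting constraint described above can be checked at compile time. The following is a minimal sketch that uses hypothetical `demo_` stand-in types rather than Zephyr's real definitions, and it assumes a GCC- or Clang-compatible compiler. Combining `packed` with `aligned(2)` is one way to arrive at a 2-byte requirement; it is not necessarily how this PR implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sockaddr: 2-byte alignment requirement. */
struct demo_sockaddr {
	uint16_t sa_family;
	uint8_t data[14];
};

/*
 * Hypothetical stand-in for struct sockaddr_ll. Without the attribute, the
 * int32_t member would typically raise the requirement to 4 bytes. Here
 * packed lowers it to 1 byte and aligned(2) raises it back to 2.
 */
struct demo_sockaddr_ll {
	uint16_t sll_family;
	uint16_t sll_protocol;
	int32_t sll_ifindex;
} __attribute__((packed, aligned(2)));

/* A derived type must not be stricter than the generic type it is cast from. */
_Static_assert(_Alignof(struct demo_sockaddr_ll) <= _Alignof(struct demo_sockaddr),
	       "sockaddr_ll stand-in is stricter than sockaddr stand-in");

/* Typical usage: receive the generic type, cast to the specific one. */
static int32_t demo_ifindex(const struct demo_sockaddr *sa)
{
	assert(sa != NULL);
	return ((const struct demo_sockaddr_ll *)sa)->sll_ifindex;
}
```

If the attribute were dropped from the stand-in, the static assertion would be expected to fail on ABIs where `int32_t` is 4-byte aligned, which is the situation the paragraph above warns about.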

In addition, compiler warnings are generated for potentially problematic
code like taking a pointer with a larger alignment requirement from one
of the struct members.

This change also resolves issues reported by the Undefined Behavior
Sanitizer (UBSAN), which flagged these unaligned accesses as undefined
behavior.

Errors reported by UBSAN fixed by this PR:

zephyr/include/zephyr/net/net_ip.h:1151:9: runtime error: member access within misaligned address 0x080b105e for type 'const struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:1151:9: runtime error: member access within misaligned address 0x080c869a for type 'const struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:1151:9: runtime error: member access within misaligned address 0x080e4106 for type 'const struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:1151:9: runtime error: member access within misaligned address 0x080e5646 for type 'const struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080a6a9e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080a961e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080aaffe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080aaffe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab21e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab21e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab21e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab23e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab25e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab2fe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab31e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab3fe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab43e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab47e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab47e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab47e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab4fe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab51e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab5fe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab61e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab67e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab67e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab67e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab6fe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab71e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab73e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab73e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab73e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab81e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab81e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080ab81e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080aba1e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080aba1e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080aba1e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abb7e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abb7e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abb7e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abbfe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abbfe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abbfe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abc5e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abd9e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abd9e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abd9e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abefe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abefe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abefe for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abf7e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abf7e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:685:9: runtime error: member access within misaligned address 0x080abf7e for type 'struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:700:22: runtime error: member access within misaligned address 0x080c8aca for type 'const struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:700:22: runtime error: member access within misaligned address 0x080c8aea for type 'const struct in6_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:817:22: runtime error: member access within misaligned address 0x080be4ce for type 'struct in_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:817:22: runtime error: member access within misaligned address 0x080cb8de for type 'struct in_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:930:9: runtime error: member access within misaligned address 0x0809e3ce for type 'const struct in_addr', which requires 4 byte alignment
zephyr/include/zephyr/net/net_ip.h:930:9: runtime error: member access within misaligned address 0x0809e3ce for type 'const struct in_addr', which requires 4 byte alignment
zephyr/subsys/net/ip/utils.c:179:5: runtime error: member access within misaligned address 0xf42fa232 for type 'struct in6_addr', which requires 4 byte alignment
zephyr/subsys/net/ip/utils.c:179:5: runtime error: member access within misaligned address 0xf42fa232 for type 'struct in6_addr', which requires 4 byte alignment
zephyr/subsys/net/ip/utils.c:179:5: runtime error: member access within misaligned address 0xf49fb232 for type 'struct in6_addr', which requires 4 byte alignment
zephyr/subsys/net/ip/utils.c:802:8: runtime error: member access within misaligned address 0xf69ff1c6 for type 'struct sockaddr_in6', which requires 4 byte alignment
zephyr/subsys/net/ip/utils.c:802:8: runtime error: member access within misaligned address 0xf6aff1c6 for type 'struct sockaddr_in6', which requires 4 byte alignment
zephyr/subsys/net/ip/utils.c:893:8: runtime error: member access within misaligned address 0xf4afb25a for type 'struct sockaddr_in', which requires 4 byte alignment
zephyr/subsys/net/l2/virtual/ipip/ipip.c:277:34: runtime error: member access within misaligned address 0x080d1a26 for type 'struct sockaddr_in', which requires 4 byte alignment
zephyr/subsys/net/l2/virtual/ipip/ipip.c:277:34: runtime error: member access within misaligned address 0x080d1a26 for type 'struct sockaddr_in', which requires 4 byte alignment
zephyr/subsys/net/l2/virtual/ipip/ipip.c:282:35: runtime error: member access within misaligned address 0x080d1a26 for type 'struct sockaddr_in6', which requires 4 byte alignment
zephyr/subsys/net/l2/virtual/ipip/ipip.c:282:35: runtime error: member access within misaligned address 0x080d1a26 for type 'struct sockaddr_in6', which requires 4 byte alignment
zephyr/subsys/tracing/ctf/ctf_top.c:436:9: runtime error: member access within misaligned address 0xf4afb2e2 for type 'struct sockaddr_in', which requires 4 byte alignment

This is a step towards fixing #90882.

@tpambor
Contributor Author

tpambor commented Jun 2, 2025

This PR also fixes #88665

jukkar previously approved these changes Jun 3, 2025
rlubos previously approved these changes Jun 3, 2025
Member

@carlescufi carlescufi left a comment

However, throughout the network subsystem, these structures
are accessed using UNALIGNED_GET/UNALIGNED_PUT macros to safely handle
unaligned memory accesses. This is often necessary as IP addresses do not
always occur aligned in network packets.

Then why do you need to pack them? Is UBSAN complaining because the values are being accessed byte-by-byte?

@tpambor
Contributor Author

tpambor commented Jun 4, 2025

However, throughout the network subsystem, these structures
are accessed using UNALIGNED_GET/UNALIGNED_PUT macros to safely handle
unaligned memory accesses. This is often necessary as IP addresses do not
always occur aligned in network packets.

Then why do you need to pack them? Is UBSAN complaining because the values are being accessed byte-by-byte?

The issue isn't with the dereferencing itself (UNALIGNED_GET and UNALIGNED_PUT handle that safely) but with the type of the pointer being accessed. Without this PR, a pointer of type struct in_addr * is expected to be aligned to a 4-byte boundary, which is the natural alignment for that type. If the pointer is not properly aligned, UBSAN reports an error whenever a member of the struct is accessed (e.g. addr->s_addr): "member access within misaligned address".

By applying the packed attribute to the struct, we explicitly lower its alignment requirement to 1 byte. This ensures that even if the pointer is not 4-byte aligned, accessing its members no longer triggers a UBSAN error.
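
To make the effect of the attribute concrete, here is a minimal sketch with hypothetical `demo_` stand-in types (not Zephyr's real definitions), assuming a GCC- or Clang-compatible compiler. It shows that the packed variant only promises 1-byte alignment, so it can legally sit at an odd offset inside a packed header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the natural (typically 4-byte) alignment. */
struct demo_in_addr {
	uint32_t s_addr;
};

/* Same layout, but packed: the alignment requirement drops to 1 byte. */
struct demo_in_addr_packed {
	uint32_t s_addr;
} __attribute__((packed));

_Static_assert(_Alignof(struct demo_in_addr_packed) == 1,
	       "packed struct should be byte-aligned");

/* Hypothetical on-wire header where the address sits at an odd offset. */
struct demo_hdr {
	uint8_t proto;
	struct demo_in_addr_packed src; /* offset 1 */
} __attribute__((packed));

static uint32_t demo_hdr_src(const struct demo_hdr *hdr)
{
	assert(hdr != NULL);
	/* The member type only promises 1-byte alignment, so this access is valid. */
	return hdr->src.s_addr;
}
```

On targets without hardware support for unaligned loads, the compiler is expected to emit byte-wise accesses for `hdr->src.s_addr`, which is where the small performance cost mentioned later in the thread would come from.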

ghost previously approved these changes Jun 4, 2025
@tpambor tpambor requested a review from carlescufi June 10, 2025 08:08
@ghost

ghost commented Jun 12, 2025

@carlescufi Is @tpambor's answer satisfactory to you? Would you like some additional changes? Do you have an alternative suggestion?

@ghost

ghost commented Jun 23, 2025

@carlescufi What do you think?

@carlescufi
Member

@tpambor sorry for the delay. I am still not quite sold on this idea. You say that the network stack is using UNALIGNED_GET/PUT() to access those members. So then my question is why do you need to pack them?
I had written an analysis a while back here: #16587 (comment), please take a look.

Also, please take a look at those PRs:
#39192
#40909

@ghost

ghost commented Jun 25, 2025

So then my question is why do you need to pack them?

Basically, we can not have our cake and eat it too. We can

  • either say that the alignment requirement of struct in_addr is 1 byte; then we can directly pointer-cast into a network packet, and code that uses it in other contexts may take a very slight performance hit (this requires adding __packed, as this PR suggests)
  • or say that the alignment requirement of struct in_addr is 4 bytes; then we can NOT directly pointer-cast into a network packet, and code that uses it in other contexts does not take the performance hit.

At the moment we try to do both: we say that the alignment requirement of struct in_addr is 4 bytes (by not having __packed), and we nonetheless use it through pointers that are not aligned that way (by casting directly into an incoming packet). This is inconsistent.

Edit: Of course one could have both by adding struct in_addr_packed or similar for the one use case and keeping the existing struct for the other.

@carlescufi
Member

carlescufi commented Jun 25, 2025

  • or we say, that the alignment requirement of struct in_addr is 4 byte, then we can NOT directly pointer cast into a network packet and code that uses it in other context may not take the performance hit.

Isn't this exactly what was done in this PR: #39192?
Essentially we are saying that struct in_addr is not packed, which means that you cannot cast it arbitrarily onto packets that go over the wire. Instead, in packet definitions we use a byte array and _raw functions.

If you really want to access the struct in_addr safely inside structs that may not be aligned, you can cast it to __packed struct in6_addr * and then access it directly, the compiler will do the right thing.

Regarding the warnings that you see with UBSAN, could you please provide pointers to the source code lines that trigger them to see exactly how to fix them?

@tpambor
Contributor Author

tpambor commented Jun 25, 2025

Regarding the warnings that you see with UBSAN, could you please provide pointers to the source code lines that trigger them to see exactly how to fix them?

In the PR description you can find a list with all locations that trigger UBSAN alignment errors, when building twister tests for the network subsystem with UBSAN enabled, e.g. running ./scripts/twister -T tests/net/ -p native_sim --enable-ubsan

@carlescufi
Member

carlescufi commented Jun 25, 2025

Regarding the warnings that you see with UBSAN, could you please provide pointers to the source code lines that trigger them to see exactly how to fix them?

In the PR description you can find a list with all locations that trigger UBSAN alignment errors, when building twister tests for the network subsystem with UBSAN enabled, e.g. running ./scripts/twister -T tests/net/ -p native_sim --enable-ubsan

Sorry, my fault, I misread this! I will take a look

@ghost

ghost commented Jun 26, 2025

Isn't this exactly what was done in this PR: #39192?

As far as I am concerned, the main goal is to ensure that UBSAN does not complain (so that it can be used effectively as a tool).
I have not (yet) had time to check whether PR #39192 fixes this.

@tpambor
Contributor Author

tpambor commented Jun 26, 2025

As far as I am concerned, the main goal is to ensure that UBSAN does not complain (so that it can be used effectively as a tool).

Sure, see also the tracking issue for the network subsystem #90882 and #88687 for more context.

@carlescufi
Member

carlescufi commented Jun 26, 2025

@tpambor @clamattia I went back and tried to understand why UBSAN is throwing these warnings, and I am confused. Most of the warnings involve something similar to:
UNALIGNED_GET(&addr->s6_addr32[0]) with uint32_t s6_addr32[4];.

So I am confused. The whole point of UNALIGNED_GET() is that it must work with any pointer, aligned or not. So either UNALIGNED_GET() is incorrectly implemented for the platform you are building for or I am missing something here. I will debug this further.

@tpambor
Contributor Author

tpambor commented Jun 26, 2025

@tpambor @clamattia I went back and tried to understand why UBSAN is throwing these warnings, and I am confused. Most of the warnings involve something similar to: UNALIGNED_GET(&addr->s6_addr32[0]) with uint32_t s6_addr32[4];.

So I am confused. The whole point of UNALIGNED_GET() is that it must work with any pointer, aligned or not. So either UNALIGNED_GET() is incorrectly implemented for the platform you are building for or I am missing something here. I will debug this further.

Under standard C rules, merely forming a pointer into a misaligned struct (e.g. by using ->) is already UB, so UB is triggered while preparing the argument for the UNALIGNED_GET macro, before the macro itself runs.

You can find more information in these threads by gcc https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217#c7 and clang llvm/llvm-project#83710
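
As an illustration of the distinction, the sketch below contrasts the flagged pattern (shown only in a comment) with a load that never forms a typed pointer to a misaligned object. `demo_load_u32` is a hypothetical helper written for this example and is not part of Zephyr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Pattern flagged by UBSAN (comment only): if p is misaligned for
 * struct in6_addr, the p->s6_addr32 member access is what gets reported,
 * regardless of how the resulting pointer is dereferenced afterwards:
 *
 *     UNALIGNED_GET(&p->s6_addr32[0]);
 */

/* Hypothetical helper: load 32 bits from a byte pointer of any alignment. */
static uint32_t demo_load_u32(const uint8_t *p)
{
	uint32_t v;

	assert(p != NULL);
	/* memcpy works on bytes, so no alignment assumption is made about p. */
	memcpy(&v, p, sizeof(v));
	return v;
}
```

The value is returned in host byte order; any conversion from network byte order would still have to be applied by the caller.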

Remove UNALIGNED_GET/UNALIGNED_PUT macros for struct pointers of
types in_addr, in6_addr, sockaddr_*. These structs are now packed,
allowing the compiler to handle unaligned accesses automatically.
Manual use of UNALIGNED_GET/UNALIGNED_PUT is no longer necessary
and has been removed to simplify the code.

Signed-off-by: Tim Pambor <tim.pambor@codewrights.de>

@tpambor
Contributor Author

tpambor commented Jun 26, 2025

@carlescufi I think #39192 is still needed as otherwise -Waddress-of-packed-member warnings would reappear for places where pointers to members of these structs were used.

I rebased and pushed a new commit which now removes UNALIGNED_GET/UNALIGNED_PUT macros for struct pointers of
types in_addr, in6_addr, sockaddr_*.

I executed all networking tests again with UBSAN enabled; no alignment-related problems were discovered and no -Waddress-of-packed-member warnings were reported. For completeness, these are the remaining UBSAN errors in the network subsystem with this PR applied:

subsys/net/ip/net_pkt.c:1946:17: runtime error: left shift of 239 by 24 places cannot be represented in type 'int'
subsys/net/lib/coap/coap.c:1500:38: runtime error: left shift of negative value -1
tests/net/lib/dns_sd/src/main.c:444:24: runtime error: left shift of 177 by 24 places cannot be represented in type 'int'
subsys/net/lib/dns/dns_sd.c:899:12: runtime error: left shift of 148 by 24 places cannot be represented in type 'int'
subsys/net/lib/lwm2m/lwm2m_rw_senml_cbor.c:648:2: runtime error: null pointer passed as argument 2, which is declared to never be null
subsys/net/lib/lwm2m/lwm2m_rw_plain_text.c:203:25: runtime error: signed integer overflow: 9223372036854775800 + 8 cannot be represented in type 'long long int'
subsys/net/lib/lwm2m/lwm2m_rw_oma_tlv.c:613:12: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
subsys/net/lib/lwm2m/lwm2m_registry.c:603:6: runtime error: null pointer passed as argument 2, which is declared to never be null
subsys/net/lib/lwm2m/lwm2m_rw_json.c:666:25: runtime error: signed integer overflow: 9223372036854775800 + 8 cannot be represented in type 'long long int'
subsys/net/ip/net_pkt.c:1946:17: runtime error: left shift of 204 by 24 places cannot be represented in type 'int'
tests/net/socket/udp/src/main.c:1498:2: runtime error: null pointer passed as argument 1, which is declared to never be null
lib/os/zvfs/zvfs_select.c:70:2: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
tests/net/iface/src/main.c:239:34: runtime error: load of value 11, which is not a valid value for type '_Bool'
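
Several of the reports above have the form "left shift of N by 24 places cannot be represented in type 'int'". The flagged source lines are not shown in this thread, so the following is only a sketch of the usual cause and remedy: a uint8_t operand is promoted to int before the shift, and a value of 128 or more shifted left by 24 does not fit in a 32-bit int. `demo_get_be32` is a hypothetical helper, not an existing Zephyr function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a big-endian 32-bit value from four bytes. */
static uint32_t demo_get_be32(const uint8_t *b)
{
	assert(b != NULL);
	/*
	 * Casting each byte to uint32_t before shifting keeps the arithmetic
	 * unsigned. Writing (b[0] << 24) instead would shift a promoted int.
	 */
	return ((uint32_t)b[0] << 24) |
	       ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) |
	       (uint32_t)b[3];
}
```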

@jukkar jukkar requested review from a user and rlubos June 27, 2025 06:45
@ghost

ghost commented Jun 27, 2025

For completeness, these are the remaining UBSAN errors in the network subsystem with this PR applied:

Consider creating a separate tracking issue for those.

@tpambor
Contributor Author

tpambor commented Jun 27, 2025

For completeness, these are the remaining UBSAN errors in the network subsystem with this PR applied:

Consider creating a separate tracking issue for those.

They are tracked here, #90882. I will update the tracking issue once this PR is merged.

@carlescufi
Member

carlescufi commented Jun 27, 2025

Sorry I could not reply earlier and thanks for your patience and ongoing work!

@carlescufi I think #39192 is still needed as otherwise -Waddress-of-packed-member warnings would reappear for places where pointers to members of these structs were used.

Are you sure? These warnings actually print: taking address of packed member of <foo> may result in an unaligned pointer value. But if you pack the addr structs, their alignment requirement immediately goes down to a single byte, so I am not convinced this warning will appear. Could you please verify that?

@carlescufi
Member

carlescufi commented Jun 27, 2025

I discussed this very topic today with @rlubos and @jukkar.
Our conclusions are as follows:

  1. The actual cause of the UBSAN errors is the parts of the code that were never properly covered in net: Remove unpacked structure references from packed structs #39192 (for example, see this access). If we followed the approach in net: Remove unpacked structure references from packed structs #39192 in all parts of the code, UBSAN would never complain. That approach, however, is relatively cumbersome and slow, because it requires copying the raw bytes into a newly allocated addr struct (remember that we have this assert and this one) to avoid directly mapping the addr structs onto a chunk of a packet, which is packed because it goes on the wire.
  2. The available ways we have to fix this are:
  • Apply the approach of net: Remove unpacked structure references from packed structs #39192 to all the missing places
  • Change how we do things instead: create a packed internal version of the addr structs (but keep the current unpacked ones) and use this packed version to cast chunks of an on-wire packet for all addr-related operations. Then you could do: net_ipv6_is_addr_mcast_packed(struct in_addr6_packed *addr)
  • Add "untyped" internal versions of the functions that take uint8_t * and then just cast that to uint32_t * and encapsulate knowledge of the IP address format programmatically outside of the structs

In any case, I am adamant about rejecting your approach of packing the addr structs, because we should not be using packed structs in a public API; that is not good practice, and it is certainly not necessary, as I explained.

@carlescufi
Member

carlescufi commented Jun 27, 2025

@tpambor one more piece of info. When building Zephyr locally with clang (not using Godbolt) you will see that you don't get the address-of-packed-member warning because of this line:

-Wno-address-of-packed-member

EDIT: See #92322

@tpambor
Contributor Author

tpambor commented Jun 27, 2025

@carlescufi Thanks for the explanations. I am not 100% sure I understand what a fix using the approach of #39192 would look like. Taking your example

(for example, see this access).

This approach would mean to do it like this?

	if (net_ipv6_is_addr_mcast((struct in6_addr *)hdr->src) ||
	    net_ipv6_is_addr_mcast_scope((struct in6_addr *)hdr->dst, 0)) {
		NET_DBG("DROP: multicast packet");
		goto drop;
	}

to

	struct in6_addr src_addr;
	struct in6_addr dst_addr;

	/* Copy the raw on-wire bytes into properly aligned local structs. */
	net_ipv6_addr_copy_raw((uint8_t *)&src_addr, hdr->src);
	net_ipv6_addr_copy_raw((uint8_t *)&dst_addr, hdr->dst);

	if (net_ipv6_is_addr_mcast(&src_addr) ||
	    net_ipv6_is_addr_mcast_scope(&dst_addr, 0)) {
		NET_DBG("DROP: multicast packet");
		goto drop;
	}

@carlescufi
Member

carlescufi commented Jun 28, 2025

This approach would mean to do it like this?

Yes, but when we discussed this with @rlubos and @jukkar we thought it'd be better to instead use this approach:

Add "untyped" internal versions of the functions that take uint8_t * and then just cast that to uint32_t * and encapsulate knowledge of the IP address format programmatically outside of the structs

In fact I can see now that #39192 was fixing it incorrectly in several places, for example:
[image: screenshot of a change from #39192 that casts the raw address array to an addr struct]

which is incorrect, because you cannot just cast that array to an addr struct, you will have issues later with UBSAN like the ones you found. Instead, the correct way to do this would be:

	if (net_ipv4_is_addr_unspecified_raw(ip_hdr->src)) {
		NET_DBG("DROP: src addr is unspecified");
		goto drop;
	}

bool net_ipv4_is_addr_unspecified_raw(uint8_t *addr) {
   /* do not cast addr to an in_addr struct, instead deal with the bytes directly */
}
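
One way such a "raw" helper body could look is sketched below. The `demo_` names are hypothetical and are not the functions that were eventually added to Zephyr; the sketch only relies on two facts about the address formats, namely that 0.0.0.0 is the unspecified IPv4 address and that IPv6 multicast addresses fall in ff00::/8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-wise check: 0.0.0.0 is the unspecified IPv4 address. */
static bool demo_ipv4_is_addr_unspecified_raw(const uint8_t *addr)
{
	assert(addr != NULL);
	/* Only single bytes are read, so the alignment of addr is irrelevant. */
	return (addr[0] | addr[1] | addr[2] | addr[3]) == 0;
}

/* Hypothetical byte-wise check: IPv6 multicast addresses are ff00::/8. */
static bool demo_ipv6_is_addr_mcast_raw(const uint8_t *addr)
{
	assert(addr != NULL);
	return addr[0] == 0xff;
}
```

Because these helpers take the header's byte array directly, a call such as `demo_ipv4_is_addr_unspecified_raw(ip_hdr->src)` needs neither a cast to an addr struct nor a temporary copy.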

@rlubos will look into this a bit more next week.

Are you on Zephyr's Discord? We could coordinate there, in the #networking channel.

@tpambor
Contributor Author

tpambor commented Jun 30, 2025

This approach would mean to do it like this?

Yes, but when we discussed this with @rlubos and @jukkar we thought it'd be better to instead use this approach:

Add "untyped" internal versions of the functions that take uint8_t * and then just cast that to uint32_t * and encapsulate knowledge of the IP address format programmatically outside of the structs

In fact I can see now that #39192 was fixing it incorrectly in several places, for example: image

which is incorrect, because you cannot just cast that array to an addr struct, you will have issues later with UBSAN like the ones you found. Instead, the correct way to do this would be:

	if (net_ipv4_is_addr_unspecified_raw(ip_hdr->src)) {
		NET_DBG("DROP: src addr is unspecified");
		goto drop;
	}

bool net_ipv4_is_addr_unspecified_raw(uint8_t *addr) {
   /* do not cast addr to an in_addr struct, instead deal with the bytes directly */
}

I agree.

Are you on Zephyr's Discord? We could coordinate there, in the #networking channel.

Yes, my handle there is tpambor as well.

Regarding performance, I tried three ways to implement this: https://godbolt.org/z/3Gh7P9rjf

bool net_ipv4_is_addr_unspecified_raw1(const uint8_t *addr)
{
    return UNALIGNED_GET((uint32_t *)addr) == 0;
}

bool net_ipv4_is_addr_unspecified_raw2(const uint8_t *addr)
{
    struct in_addr_old _addr;

    memcpy(&_addr, addr, sizeof(_addr));

    return net_ipv4_is_addr_unspecified(&_addr);
}

bool net_ipv4_is_addr_unspecified_raw3(const uint8_t *addr)
{
    struct in_addr_old _addr;

    memcpy(&_addr, addr, sizeof(_addr));

    return _addr.s_addr == 0;
}

Using clang, all variants are optimized and compiled to the same assembly for Cortex-M0. Using GCC, memcpy is not inlined for variants _raw2 and _raw3, which may impact performance. So performance-wise, the _raw1 variant might be the best way. I think variant _raw2 would have better maintainability, since it reuses the existing net_ipv4_is_addr_unspecified.

For chips that support unaligned access, all variants compiled with Clang, as well as the _raw1 and _raw3 variants compiled with GCC, produce identical assembly. However, for the _raw2 variant, GCC unnecessarily reserves stack space, even though it is never used.
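The snippets above depend on Zephyr's `UNALIGNED_GET` and on a `struct in_addr_old` that only exists in the Godbolt example. For trying the comparison outside the Zephyr tree, the following is a self-contained sketch of the _raw1 and _raw3 styles. Every name in it is a stand-in chosen for illustration, and the packed, may_alias struct is only an approximation of what an unaligned-load macro does on GCC and Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the pre-change, 4-byte-aligned address struct (assumption). */
struct in_addr_old {
	uint32_t s_addr;
};

/*
 * Stand-in for an unaligned 32-bit load: packed lowers the alignment to 1,
 * may_alias permits reading bytes that were stored as uint8_t. Both are
 * GCC/Clang extensions.
 */
struct __attribute__((__packed__, __may_alias__)) u32_unaligned {
	uint32_t v;
};

static inline uint32_t unaligned_get_u32(const void *p)
{
	assert(p != NULL);

	return ((const struct u32_unaligned *)p)->v;
}

/* _raw1 style: one unaligned load straight from the packet bytes. */
bool is_unspecified_raw1(const uint8_t *addr)
{
	return unaligned_get_u32(addr) == 0;
}

/* _raw3 style: copy into an aligned temporary, then test the copy. */
bool is_unspecified_raw3(const uint8_t *addr)
{
	struct in_addr_old tmp;

	memcpy(&tmp, addr, sizeof(tmp));

	return tmp.s_addr == 0;
}
```

Both functions give the same answer for any input. The difference discussed above is only in the code the compiler generates for them.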

@carlescufi
Member

Regarding performance, I tried three ways to implement this: https://godbolt.org/z/3Gh7P9rjf

Thanks for the investigation! I will leave it to @rlubos, who is currently looking at fixing this throughout the code, to comment on the different approaches.

So performance-wise, _raw1 variant might be the best way. I think variant _raw2 would have better maintainability.

I am not too surprised, it's perhaps the simplest for the compiler to deal with.

I think variant _raw2 would have better maintainability.

Yes, but at a cost. These functions are called per-packet, and adding another indirection (i.e. a second call) may seriously impact performance as you mentioned in your analysis.

@rlubos
Contributor

rlubos commented Jul 1, 2025

Thanks @tpambor. So far I've been looking into the IPv6 side of things, and I've been taking the raw1 approach, as it seemed most reasonable. However, it seems that in the end I'll only be using this for the critical data path, and copy the address from the header to struct in6_addr elsewhere (for example, when handling ND messages); there were just too many dependencies (neighbor table, routing) to convert everything. I should be able to share some PoC for IPv6 a bit later.
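The "copy elsewhere" half of that plan might look roughly like the sketch below. It assumes a simplified stand-in for `struct in6_addr`, and the helper name is hypothetical, not taken from the PoC. The raw header bytes are copied into a properly aligned local address, which can then be passed to existing code such as the neighbor table or routing unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct in6_addr: 16 bytes, 4-byte aligned. */
struct in6_addr_sketch {
	union {
		uint8_t s6_addr[16];
		uint32_t s6_addr32[4];
	};
};

/* Hypothetical helper: copy raw header bytes into an aligned address. */
static inline void ipv6_addr_copy_from_raw(struct in6_addr_sketch *dst,
					   const uint8_t *src)
{
	assert(dst != NULL && src != NULL);

	memcpy(dst->s6_addr, src, sizeof(dst->s6_addr));
}
```

The cost is a 16-byte copy per address, which is why the thread reserves it for paths such as ND handling and keeps the raw accessors for the per-packet path.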

@rlubos
Contributor

rlubos commented Jul 1, 2025

@tpambor
Contributor Author

tpambor commented Jul 1, 2025

Thanks @tpambor. So far I've been looking into the IPv6 side of things, and I've been taking the raw1 approach, as it seemed most reasonable. However, it seems that in the end I'll only be using this for the critical data path, and copy the address from the header to struct in6_addr elsewhere (for example, when handling ND messages); there were just too many dependencies (neighbor table, routing) to convert everything. I should be able to share some PoC for IPv6 a bit later.

I agree, I looked at your PoC and it looks reasonable. Just to let you know, I re-ran the tests, and for IPv6 there are two more locations that trigger UBSAN:

2025-07-01 14:34:00,551 - twister - DEBUG - OUTPUT: /home/user/west_workspace/zephyr/include/zephyr/net/net_ip.h:716:42: runtime error: member access within misaligned address 0x08147cde for type 'const struct in6_addr', which requires 4 byte alignment
2025-07-01 14:34:00,552 - twister - DEBUG - OUTPUT: START - test_ipv6_both_specified_bad
2025-07-01 14:34:00,555 - twister - DEBUG - OUTPUT:  PASS - test_ipv6_both_specified_bad in 0.000 seconds
2025-07-01 14:34:00,556 - twister - DEBUG - OUTPUT: 0x08147cde: note: pointer points here
2025-07-01 14:34:00,557 - twister - DEBUG - OUTPUT:  00 08 3a 40 20 01  0d b8 00 00 00 00 00 00  00 00 00 00 00 02 20 01  0d b8 00 00 00 00 00 00  00 00
2025-07-01 14:34:00,560 - twister - DEBUG - OUTPUT:              ^
2025-07-01 14:34:00,564 - twister - DEBUG - OUTPUT: ===================================================================
2025-07-01 14:34:00,564 - twister - DEBUG - OUTPUT:     #0 0x080ac59f in net_ipv6_is_addr_mcast /home/user/west_workspace/zephyr/include/zephyr/net/net_ip.h:716:42
2025-07-01 14:34:00,568 - twister - DEBUG - OUTPUT:     #1 0x080ac59f in net_ipv6_create /home/user/west_workspace/zephyr/subsys/net/ip/ipv6.c:90:7
2025-07-01 14:34:00,568 - twister - DEBUG - OUTPUT: START - test_ipv6_both_specified_good
2025-07-01 14:34:00,570 - twister - DEBUG - OUTPUT: 
2025-07-01 14:34:00,572 - twister - DEBUG - OUTPUT:  PASS - test_ipv6_both_specified_good in 0.000 seconds
2025-07-01 14:34:00,575 - twister - DEBUG - OUTPUT: SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use /home/user/west_workspace/zephyr/include/zephyr/net/net_ip.h:716:42

2025-07-01 14:39:54,036 - twister - DEBUG - OUTPUT: /home/user/west_workspace/zephyr/subsys/net/ip/utils.c:179:26: runtime error: member access within misaligned address 0xf45fb1e2 for type 'struct in6_addr', which requires 4 byte alignment
2025-07-01 14:39:54,048 - twister - DEBUG - OUTPUT: 0xf45fb1e2: note: pointer points here
2025-07-01 14:39:54,056 - twister - DEBUG - OUTPUT:  02 00  00 00 ff 02 00 00 00 00  00 00 00 00 00 00 00 00  00 01 02 00 00 00 ff 02  00 00 00 00 00 00
2025-07-01 14:39:54,063 - twister - DEBUG - OUTPUT:               ^
2025-07-01 14:39:54,080 - twister - DEBUG - OUTPUT:     #0 0x080ad2c8 in z_impl_net_addr_ntop /home/user/west_workspace/zephyr/subsys/net/ip/utils.c:179:26
2025-07-01 14:39:54,092 - twister - DEBUG - OUTPUT: 
2025-07-01 14:39:54,108 - twister - DEBUG - OUTPUT: SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use /home/user/west_workspace/zephyr/subsys/net/ip/utils.c:179:26
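In the first trace, net_ipv6_is_addr_mcast is handed a pointer at 0x08147cde, which is only 2-byte aligned. A raw variant of that check could sidestep the issue, as in the sketch below. The function name is hypothetical; the only fact relied on is that IPv6 multicast addresses fall in ff00::/8, so the first byte decides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical raw multicast test: IPv6 multicast addresses start with the
 * byte 0xff (ff00::/8), so a single byte load is enough and no alignment
 * requirement applies to addr.
 */
static inline bool ipv6_is_addr_mcast_raw(const uint8_t *addr)
{
	assert(addr != NULL);

	return addr[0] == 0xff;
}
```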

@rlubos
Contributor

rlubos commented Jul 1, 2025

Thanks @tpambor, oddly, I was not able to reproduce those failures locally but there were definitely issues in utils and ICMPv6 code (that must be where net_ipv6_create() was misused). I have updated my branch with the following:

  • IPv4 rework
  • Fixes to the issues you've reported
  • Some extra fixes for the errors I've encountered:

https://github.com/zephyrproject-rtos/zephyr/compare/main...rlubos:net/ip-addr-casting-cleanup?expand=1

I'll do some extra testing, but unless I encounter some blocker I think I could create a PR tomorrow.

@tpambor
Contributor Author

tpambor commented Jul 2, 2025

Thanks @tpambor, oddly, I was not able to reproduce those failures locally but there were definitely issues in utils and ICMPv6 code (that must be where net_ipv6_create() was misused). I have updated my branch with the following:

  • IPv4 rework
  • Fixes to the issues you've reported
  • Some extra fixes for the errors I've encountered:

https://github.com/zephyrproject-rtos/zephyr/compare/main...rlubos:net/ip-addr-casting-cleanup?expand=1

I'll do some extra testing, but unless I encounter some blocker I think I could create a PR tomorrow.

Nice work, thank you! I encountered one more for IPv4, in net/checksum_offload/net.offload:

START - test_rx_chksum_offload_disabled_test_v4_icmp_frag
/home/user/west_workspace/zephyr/include/zephyr/net/net_ip.h:998:38: runtime error: member access within misaligned address 0x082b12c6 for type 'const struct in_addr', which requires 4 byte alignment
0x082b12c6: note: pointer points here
 40 01 f8 ee c0 00  02 02 c0 00 02 01 c0 c1  c2 c3 c4 c5 c6 c7 c8 c9  ca cb cc cd ce cf d0 d1  d2 d3
             ^ 
    #0 0x08127d7d in net_ipv4_addr_cmp /home/user/west_workspace/zephyr/include/zephyr/net/net_ip.h:998:38
    #1 0x0818535d in reassembly_get /home/user/west_workspace/zephyr/subsys/net/ip/ipv4_fragment.c:43:7
    #2 0x0818e04a in net_ipv4_handle_fragment_hdr /home/user/west_workspace/zephyr/subsys/net/ip/ipv4_fragment.c:334:10
    #3 0x0816f1b8 in net_ipv4_input /home/user/west_workspace/zephyr/subsys/net/ip/ipv4.c:379:11
    #4 0x081079aa in process_data /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:136:11
    #5 0x08102851 in processing_data /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:154:10
    #6 0x08103d21 in net_rx /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:501:2
    #7 0x08103c7c in net_process_rx_packet /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:513:2
    #8 0x0815d8a0 in tc_rx_handler /home/user/west_workspace/zephyr/subsys/net/ip/net_tc.c:310:3
    #9 0x080cb78d in z_thread_entry /home/user/west_workspace/zephyr/lib/os/thread_entry.c:48:2
    #10 0x080e28bc in posix_arch_thread_entry /home/user/west_workspace/zephyr/arch/posix/core/thread.c:96:2
    #11 0x081d57c1 in nct_thread_starter /home/user/west_workspace/zephyr/scripts/native_simulator/common/src/nct.c:290:2
    #12 0xf7c60fe6 in start_thread nptl/pthread_create.c:447:8
    #13 0xf7cf85a7 in clone3 misc/../sysdeps/unix/sysv/linux/i386/clone3.S:111

SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use /home/user/west_workspace/zephyr/include/zephyr/net/net_ip.h:998:38
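Here the comparison in reassembly_get reads a `struct in_addr` at 0x082b12c6, again only 2-byte aligned. A byte-wise comparison would not need any alignment. The sketch below uses a hypothetical helper name and is not necessarily how the fragmentation code was fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical raw comparison: two IPv4 addresses are equal when their four
 * bytes match. memcmp has no alignment requirement on its arguments.
 */
static inline bool ipv4_addr_equal_raw(const uint8_t *a, const uint8_t *b)
{
	assert(a != NULL && b != NULL);

	return memcmp(a, b, 4) == 0;
}
```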

And another one for IPv6 (though not alignment related):

START - test_rx_chksum_offload_disabled_test_v6_icmp_frag
/home/user/west_workspace/zephyr/subsys/net/ip/net_pkt.c:1946:17: runtime error: left shift of 239 by 24 places cannot be represented in type 'int'
    #0 0x08152863 in net_pkt_read_be32 /home/user/west_workspace/zephyr/subsys/net/ip/net_pkt.c:1946:17
    #1 0x08182b7e in net_ipv6_handle_fragment_hdr /home/user/west_workspace/zephyr/subsys/net/ip/ipv6_fragment.c:491:6
    #2 0x081769b2 in net_ipv6_input /home/user/west_workspace/zephyr/subsys/net/ip/ipv6.c:736:12
    #3 0x0810792b in process_data /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:134:11
    #4 0x08102851 in processing_data /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:154:10
    #5 0x08103d21 in net_rx /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:501:2
    #6 0x08103c7c in net_process_rx_packet /home/user/west_workspace/zephyr/subsys/net/ip/net_core.c:513:2
    #7 0x0815d8a0 in tc_rx_handler /home/user/west_workspace/zephyr/subsys/net/ip/net_tc.c:310:3
    #8 0x080cb78d in z_thread_entry /home/user/west_workspace/zephyr/lib/os/thread_entry.c:48:2
    #9 0x080e28bc in posix_arch_thread_entry /home/user/west_workspace/zephyr/arch/posix/core/thread.c:96:2
    #10 0x081d57c1 in nct_thread_starter /home/user/west_workspace/zephyr/scripts/native_simulator/common/src/nct.c:290:2
    #11 0xf7c7afe6 in start_thread nptl/pthread_create.c:447:8
    #12 0xf7d125a7 in clone3 misc/../sysdeps/unix/sysv/linux/i386/clone3.S:111

Quick fix:

int net_pkt_read_be32(struct net_pkt *pkt, uint32_t *data)
{
	uint8_t d32[4];
	int ret;

	ret = net_pkt_read(pkt, d32, sizeof(uint32_t));

-	*data = d32[0] << 24 | d32[1] << 16 | d32[2] << 8 | d32[3];
+	*data = (uint32_t)d32[0] << 24 | (uint32_t)d32[1] << 16 | (uint32_t)d32[2] << 8 | (uint32_t)d32[3];

	return ret;
}
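The reason the casts matter: in `d32[0] << 24` the `uint8_t` operand is promoted to `int`, and shifting a value of 128 or more left by 24 does not fit in a 32-bit `int`, which is exactly the "left shift of 239 by 24 places" that UBSAN reports. The standalone sketch below shows the same pattern; the helper name is made up for illustration and is not part of net_pkt.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble a big-endian 32-bit value from four bytes. Each byte is widened
 * to uint32_t before shifting, so the shift happens in an unsigned type and
 * cannot overflow a signed int.
 */
static inline uint32_t read_be32(const uint8_t *b)
{
	assert(b != NULL);

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

With a first byte of 0xef (239), as in the trace, the bytes ef 01 02 03 are combined into 0xef010203 with no signed overflow.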

For testing I used this branch, https://github.com/tpambor/zephyr/commits/fixes-ubsan/, which also has the other pending UBSAN pull requests merged, since one runtime error sometimes hides further ones.

@rlubos
Contributor

rlubos commented Jul 2, 2025

@tpambor Thanks! I've caught those as well; they should be fixed now. The net/checksum_offload/net.offload failure was due to the IP fragmentation code.

@rlubos
Contributor

rlubos commented Jul 2, 2025

So here's the alternative PR: #92539. Combined with fixes from other PRs it seems to fix all UBSAN complaints in the networking code.

@tpambor
Contributor Author

tpambor commented Jul 2, 2025

I will close this PR in favor of #92539.

@tpambor tpambor closed this Jul 2, 2025
tpambor added a commit to endresshauser-lp/zephyr that referenced this pull request Jul 4, 2025