Description
Gist
- Incorrect codegen causes a conditional jump based on a load from an uninitialized memory location on the stack.
- Issue was reproduced on release builds on both x64 and ARM64 with Rust stable 1.32
- Issue can be mitigated by restricting codegen-units to 1 (and using a nightly build)
- Issue can be observed under valgrind / gdb. Note that the code generated by #3 is very similar, but there's a control flow change that prevents the code path for #1 from being executed
- Issue is likely related to an observed SEGV in 100% safe Rust code
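The codegen-units mitigation mentioned above can be sketched as a Cargo profile setting. This is only one way to apply it; the report does not say how the setting was applied in the original builds, so placing it in the release profile is an assumption here.

```toml
# Cargo.toml of the affected crate (assumed location for the setting)
[profile.release]
# Compile each crate as a single codegen unit instead of splitting it up.
codegen-units = 1
```

Note that the report pairs this setting with a nightly toolchain, so the setting alone may not be sufficient on stable 1.32.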
Background:
This was discovered during our attempt to root-cause an inexplicable SEGV in 100% safe Rust code. The short version is that the SEGV was caused by the corruption of a doubly linked list structure used by malloc(), and the structure was ostensibly corrupted when one of its pointers was somehow overwritten. Since the issue made absolutely no sense, we tried to distill it down and were surprised when valgrind showed that our test code performed a conditional jump based on uninitialized data.
** Error and analysis **
- valgrind error
valgrind --track-origins=yes target/release/bug_repro
...
==32239== Conditional jump or move depends on uninitialised value(s)
==32239== at 0x113B86: <serde_cbor::de::Deserializer>::parse_map
==32239== by 0x10E16A: <serde_cbor::de::Deserializer>::parse_value
==32239== by 0x10CB27: serde_cbor::de::from_slice
==32239== by 0x114AFC: bug_repro::main
...
==32239== Uninitialised value was created by a stack allocation
==32239== at 0x1137B3: <serde_cbor::de::Deserializer>::parse_map (in /home/ANT.AMAZON.COM/khareatu/kaos/Carbon2/src/UGMH/target/release/bug_repro)
- We can validate the above with gdb:
(gdb) info functions parse_map
0x00000000001137c0 <serde_cbor::de::Deserializer>::parse_map
(gdb) disas 0x1137c0
Dump of assembler code for function ZN46$LT$serde_cbor..de..Deserializer$LT$R$GT$$GT$9parse_map17h2e6263b3d6f0a8b5E:
...
0x00000000001137ca <+10>: sub $0xf8,%rsp => Stack frame is allocated here
0x00000000001137d1 <+17>: mov %rsi,%r13 => We set a break point on this instruction
Breakpoint 2, 0x00000000001137d1 in <serde_cbor::de::Deserializer>::parse_map ()
(gdb) info registers
...
rsp 0x1ffefff6d0 0x1ffefff6d0
valgrind shows that the uninitialized data is loaded at a later point in the function:
0x0000000000113b6f <+943>: mov 0x90(%rsp),%rsi => Load happens here
0x0000000000113b77 <+951>: cmp $0x1,%r12
...
0x0000000000113b8b <+971>: test %rsi,%rsi
0x0000000000113b8e <+974>: mov 0x80(%rsp),%rbp
0x0000000000113b96 <+982>: je 0x113bbd => Jump based on the above load
We can dump the uninitialized bytes in the stack frame and set a breakpoint at 0x113b8e:
(gdb) x/4xg 0x1ffefff6d0+0x90
0x1ffefff760: 0x00000000042289f0 0x0000000004044000
0x1ffefff770: 0x0000000004028440 0x0000000004028930
(gdb) x/xg 42289f0
(gdb) break *0x113b8e
(gdb) continue
.... => We break following the load of rsi from [rsp+0x90]
0x0000000000113b96 in <serde_cbor::de::Deserializer>::parse_map ()
(gdb) info registers
....
rsi 0x42289f0 69372400
We can see that this is the same random value that was at [rsp+0x90] when the stack frame was allocated. In fact, we can further verify this by changing the value at [rsp+0x90] to some other value at the first breakpoint and checking that rsi picks it up, which confirms that the slot is not initialized later in the control flow.
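The address arithmetic in the gdb session above can be double-checked with a small sketch. The helper names `stack_slot` and `function_offset` are hypothetical and exist only for illustration; the constants are copied from the transcript.

```rust
/// Values copied from the gdb transcript above.
const PARSE_MAP_START: u64 = 0x1137c0; // address reported by `info functions parse_map`
const RSP_AFTER_PROLOGUE: u64 = 0x1ffefff6d0; // rsp at the first breakpoint

/// Address of a stack slot written as `displacement(%rsp)` in the disassembly.
/// (Hypothetical helper, not part of any library.)
fn stack_slot(rsp: u64, displacement: u64) -> u64 {
    rsp + displacement
}

/// Offset of an instruction from the start of its function, which is the
/// number gdb prints as `<+N>`. Returns None if the instruction address
/// lies before the function start.
fn function_offset(function_start: u64, instruction: u64) -> Option<u64> {
    instruction.checked_sub(function_start)
}
```

For example, `stack_slot(RSP_AFTER_PROLOGUE, 0x90)` gives `0x1ffefff760`, the address dumped by `x/4xg`, and `function_offset(PARSE_MAP_START, 0x113b6f)` gives `Some(943)`, matching the `<+943>` shown next to the load.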