New code-gen options for retpolines and straight line speculation #51665

andyhhp · 2021-10-26T15:55:39Z


Bugzilla Link	52323
Version	unspecified
OS	Linux
Blocks	#4440
CC	@andyhhp,@chandlerc,@DougGregor,@efriedma-quic,@jyknight,@m-gupta,@nickdesaulniers,@pageexec,@phoebewang,@zygoloid,@rnk

Extended Description

Hello

[FYI, this is being cross-requested of GCC too]

Linux and other kernel-level software make use of -mindirect-branch=thunk-extern to be able to alter the handling of indirect branches at boot. It turns out to be advantageous to inline the thunks when retpoline is not in use. https://lore.kernel.org/lkml/20211026120132.613201817@infradead.org/ is some infrastructure to make this work.

In some cases, we want to be able to inline an lfence; jmp *%reg thunk. This is fine for the low 8 registers, but not for %r{8..15}, where the REX prefix pushes the replacement size to 6 bytes, one more than the 5-byte direct call or jmp to the thunk that it replaces.

It would be very useful to have a code-gen option to write out call %cs:__x86_indirect_thunk_r{8..15} where the redundant %cs prefix will increase the instruction length to 6, allowing the non-retpoline form to be inlined.

Relatedly, x86 straight line speculation has been discussed before, but without any action taken. It would be helpful to have a code-gen option which would emit int3 following any ret instruction and any indirect jump, as neither of these two cases has following architectural execution.
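
To make the request concrete, here is a toy sketch of the transformation, written as a post-processing pass over AT&T-syntax assembly text. The function name `harden_sls` and the regular expression are illustrative assumptions; a real implementation would live in the compiler backend, not in a text filter.

```python
import re

# Matches `ret`/`retq` and indirect jumps such as `jmp *%rax` or `jmpq *(%rax)`.
# Direct jumps (`jmp .Ltarget`) and calls are deliberately not matched.
_NO_FALLTHROUGH = re.compile(r"^\s*(ret[ql]?\b|jmp[ql]?\s+\*)")


def harden_sls(asm_lines):
    """Return a copy of asm_lines with `int3` after each ret / indirect jmp."""
    out = []
    for line in asm_lines:
        out.append(line)
        if _NO_FALLTHROUGH.match(line):
            out.append("\tint3")
    return out


example = [
    "\tmovq\t%rdi, %rax",
    "\tjmp\t*%rax",      # indirect jump: gets an int3
    "\tcall\t*%rbx",     # call: execution continues after it, left alone
    "\tjmp\t.Ltarget",   # direct jump: left alone
    "\tret",             # return: gets an int3
]
hardened = harden_sls(example)
```

Only the `jmp *%rax` and the `ret` are followed by `int3` in `hardened`; the call and the direct jump are left untouched.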

The reason these two are related is that if both options are in use, we want an extra byte of replacement space to be able to inline lfence; jmp *%reg; int3.
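
The byte counts behind this can be checked with a small sketch that assembles the replacement sequences by hand. The helper names are made up for illustration; the opcodes are the standard x86-64 encodings (lfence = 0F AE E8, jmp *%reg = FF /4, int3 = CC, call rel32 = E8 + 4 bytes, %cs prefix = 2E).

```python
LFENCE = bytes([0x0F, 0xAE, 0xE8])
INT3 = bytes([0xCC])


def jmp_reg(reg):
    """Encode `jmp *%reg` for register number 0..15 (0 = %rax, 8 = %r8)."""
    rex = bytes([0x41]) if reg >= 8 else b""  # REX.B selects %r8..%r15
    return rex + bytes([0xFF, 0xE0 | (reg & 7)])


def inline_thunk(reg, sls=False):
    """Bytes of `lfence; jmp *%reg`, optionally followed by `int3`."""
    body = LFENCE + jmp_reg(reg)
    return body + INT3 if sls else body


def thunk_call_len(cs_prefix=False):
    """Length of `call __x86_indirect_thunk_*`, with or without a %cs prefix."""
    return 5 + (1 if cs_prefix else 0)


sizes = {
    "low reg": len(inline_thunk(0)),
    "high reg": len(inline_thunk(8)),
    "low reg + int3": len(inline_thunk(0, sls=True)),
    "high reg + int3": len(inline_thunk(8, sls=True)),
    "call": thunk_call_len(),
    "cs call": thunk_call_len(cs_prefix=True),
}
```

Without int3 the replacement is 5 bytes for the low registers and 6 for %r8..%r15, against a 5-byte call or a 6-byte %cs-prefixed call. Appending int3 grows each replacement by one byte, which is the extra byte of space mentioned above.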

Third, Clang has been observed to emit conditional tail calls as Jcc __x86_indirect_thunk_*. This is a 6-byte source size, but needs up to 9 bytes of space for inlining, including an int3 for straight line speculation reasons (see https://lore.kernel.org/lkml/20211026120310.359986601@infradead.org/ for full details). It might be enough to simply prohibit an optimisation like this when trying to pad retpolines for inlineability.
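
One way to arrive at the 6 versus 9 byte figures is sketched below. It assumes the inlined form inverts the condition with a short Jcc that skips over `lfence; jmp *%reg; int3`; that exact sequence is an assumption made for illustration, and the linked patch is the authority on what the kernel actually emits. The helper names are hypothetical.

```python
LFENCE = bytes([0x0F, 0xAE, 0xE8])
INT3 = bytes([0xCC])


def jmp_reg(reg):
    """Encode `jmp *%reg` for register number 0..15 (0 = %rax, 8 = %r8)."""
    rex = bytes([0x41]) if reg >= 8 else b""  # REX.B selects %r8..%r15
    return rex + bytes([0xFF, 0xE0 | (reg & 7)])


def jcc_rel32(cc, disp=0):
    """Near conditional jump: 0F 80+cc followed by a 32-bit displacement."""
    return bytes([0x0F, 0x80 | cc]) + disp.to_bytes(4, "little", signed=True)


def jcc_rel8(cc, disp):
    """Short conditional jump: 70+cc followed by an 8-bit displacement."""
    return bytes([0x70 | cc, disp])


def inline_jcc(cc, reg):
    """Assumed inline form: Jncc over `lfence; jmp *%reg; int3`."""
    body = LFENCE + jmp_reg(reg) + INT3
    # Flipping the low bit of an x86 condition code inverts the condition.
    return jcc_rel8(cc ^ 1, len(body)) + body


JE = 0x4
source_len = len(jcc_rel32(JE))       # Jcc __x86_indirect_thunk_*
inline_low = len(inline_jcc(JE, 0))   # via %rax
inline_high = len(inline_jcc(JE, 8))  # via %r8
```

Under that assumption the source site is 6 bytes, while the inlined form is 8 bytes for the low registers and 9 for %r8..%r15.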

andyhhp · 2021-10-26T15:56:32Z

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952 for GCC cross-request.

nickdesaulniers · 2021-11-18T22:01:03Z

It looks like GCC has added support for -mindirect-branch-cs-prefix:

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=2196a681d7810ad8b227bf983f38ba716620545e

This is being used when available in the Linux kernel:

https://lore.kernel.org/lkml/20211118185421.GK174703@worktop.programming.kicks-ass.net/

efriedma-quic · 2021-11-19T18:49:31Z

> Relatedly, x86 straight line speculation has been discussed before, but
> without any action taken. It would be helpful to have a code gen option
> which would emit int3 following any ret instruction, and any indirect
> jump, as neither of these two cases have following architectural execution.

Is there documentation somewhere describing this mitigation? In particular:

What unconditional branches can lead to straight-line speculation?
What instructions can be used to stop speculation? (Is int3 actually effective? Are there other instructions that would also work?)

andyhhp · 2021-11-20T13:17:48Z

> Relatedly, x86 straight line speculation has been discussed before, but
> without any action taken. It would be helpful to have a code gen option
> which would emit int3 following any ret instruction, and any indirect
> jump, as neither of these two cases have following architectural execution.
>
> Is there documentation somewhere describing this mitigation? In particular:
>
> What unconditional branches can lead to straight-line speculation?

For AMD, it is discussed here https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf, mitigation G-5 on the final page:

> Place an LFENCE after an indirect branch instruction (RET, JMP reg or mem,
> CALL reg or mem) to help prevent possible sequential speculation.

For Intel, notes are included in SDM Vol2 for the CALL and JMP instructions:

> Certain situations may lead to the next sequential instruction after a
> near indirect CALL being speculatively executed. If software needs to
> prevent this (e.g., in order to prevent a speculative execution side
> channel), then an LFENCE instruction opcode can be placed after the near
> indirect CALL in order to block speculative execution.

> What instructions can be used to stop speculation? (Is int3 actually
> effective? Are there other instructions that would also work?)

As you can see, LFENCE is the official recommendation. It is about the only option for halting speculation which is safe to actually execute and doesn't otherwise impact program state.

CALL has architectural execution following it. However, the code following a CALL instruction is typically preservation of the return value and a pile of dead registers wanting reloading, and is typically not a pointer dereference involving a callee-clobbered register. Therefore, CALLs are unlikely to have subsequent instructions which are vulnerable to speculative type confusion, and are therefore uninteresting to protect.

JMP and RET are different. They are followed by arbitrary unrelated basic blocks, which could contain anything.

We could use LFENCE everywhere. However, as we don't architecturally execute the instruction, we don't care about architectural side effects. Basically any instruction which causes a decode exception, or is microcoded, halts speculation. INT3 is safe to use, and is 1/3 of the length of LFENCE, so has less of an impact on code size.
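
As a rough illustration of the code-size argument, the sketch below compares the padding added by each barrier. The site count is a made-up number for illustration, not a measurement of any real binary.

```python
LFENCE = bytes([0x0F, 0xAE, 0xE8])  # 3-byte encoding
INT3 = bytes([0xCC])                # 1-byte encoding


def padding_bytes(n_sites, barrier):
    """Extra bytes added by placing `barrier` after each of n_sites branches."""
    return n_sites * len(barrier)


N_SITES = 10_000  # hypothetical number of ret / indirect jmp sites
with_lfence = padding_bytes(N_SITES, LFENCE)
with_int3 = padding_bytes(N_SITES, INT3)
```

For any number of sites, the int3 padding is one third of the lfence padding.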

efriedma-quic · 2021-11-22T20:36:46Z

> CALL has architectural execution following it. However, the code following
> a CALL instruction is typically preservation of the return value and a pile
> of dead registers wanting reloading, and is typically not a pointer
> dereference involving a callee-clobbered register.

I'm a bit skeptical of heuristics like this; it's making very specific assumptions about how the compiler generates code, which might not hold for different codebases and/or optimizations.

> We could use LFENCE everywhere. However, as we don't architecturally
> execute the instruction, we don't care about architectural side effects.
> Basically any instruction which causes a decode exception, or is microcoded,
> halts speculation. INT3 is safe to use, and is 1/3 of the length of LFENCE,
> so has less of an impact on code size.

It looks like the current version of the Intel manual actually explicitly mentions INT3, so I guess that's fine.

andyhhp · 2021-11-22T22:00:07Z

> It looks like the current version of Intel manual actually explicitly
> mentions INT3, so I guess that's fine.

Ah great - I'd missed that update coming through. I'll pester the other guys to document too.

> CALL has architectural execution following it. However, the code following
> a CALL instruction is typically preservation of the return value and a pile
> of dead registers wanting reloading, and is typically not a pointer
> dereference involving a callee-clobbered register.

> I'm a bit skeptical of heuristics like this; it's making very specific
> assumptions about how the compiler generates code, which might not hold for
> different codebases and/or optimizations.

Nevertheless, protecting JMP/RET with an INT3 is easy and cheap, while protecting CALL with LFENCE is very much not, and the risk profiles of the code are very different.

My gut feeling is that anyone wanting protection in the CALL case would probably be using Speculative Load Hardening instead.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021

Endilll added the clang:to-be-triaged (Should not be used for new issues) label Jul 18, 2024