MabezDev
Contributor

@MabezDev MabezDev commented Oct 3, 2025

This implements the asm! support for Xtensa. We've been using this code for a few years in our fork and it's been working well. I finally found some time to clean it up a bit and start the upstreaming process. This should be one of the final PRs for Xtensa support on the Rust side (minus bug fixes of course). After this, we're mostly just waiting on the LLVM upstreaming, which is going well. This PR doesn't cover all possible asm options for Xtensa, only the base ISA plus a few extras that are used in Espressif chips.

r? Amanieu

@rustbot
Collaborator

rustbot commented Oct 3, 2025

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rustbot rustbot added the A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.) and S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) labels Oct 3, 2025
@rustbot rustbot added the T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) label Oct 3, 2025

Co-authored-by: Taiki Endo <te316e89@gmail.com>
Co-authored-by: Kerry Jones <kerry@iodrive.co.za>
@rust-log-analyzer
Collaborator

The job aarch64-gnu-llvm-20-1 failed!

Possible cause of the failure (guessed by this bot):
---- [ui] tests/ui/check-cfg/target_feature.rs stdout ----
Saved the actual stderr to `/checkout/obj/build/aarch64-unknown-linux-gnu/test/ui/check-cfg/target_feature/target_feature.stderr`
diff of stderr:

29 `amx-tile`
30 `amx-transpose`
31 `apxf`
+ `atomctl`
32 `atomics`
33 `avx`
34 `avx10.1`

64 `cache`
65 `cmpxchg16b`
---
108 `fp-armv8`
109 `fp16`
110 `fp64`

130 `hbc`
131 `high-registers`
132 `high-word`
+ `highpriinterrupts`
133 `hvx`
134 `hvx-length128b`
135 `hwdiv`

136 `i8mm`
+ `interrupt`
137 `isa-68000`
138 `isa-68010`
139 `isa-68020`

151 `lbt`
152 `ld-seq-sa`
153 `leoncasa`
+ `loop`
154 `lor`
155 `lse`
156 `lse128`

160 `lvz`
161 `lzcnt`
162 `m`
+ `mac16`
---
181 `multivalue`
182 `mutable-globals`
183 `neon`

184 `nnp-assist`
185 `nontrapping-fptoint`
+ `nsa`
186 `nvic`
187 `outline-atomics`
188 `paca`

200 `power9-altivec`
201 `power9-vector`
202 `prfchw`
+ `prid`
---
To only update this specific test, also pass `--test-args check-cfg/target_feature.rs`

error: 1 errors occurred comparing output.
status: exit status: 0
command: env -u RUSTC_LOG_COLOR RUSTC_ICE="0" RUST_BACKTRACE="short" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/tests/ui/check-cfg/target_feature.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=/cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=/checkout/vendor" "--sysroot" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage2" "--target=aarch64-unknown-linux-gnu" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "--emit" "metadata" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/aarch64-unknown-linux-gnu/test/ui/check-cfg/target_feature" "-A" "unused" "-A" "internal_features" "-A" "unused_parens" "-A" "unused_braces" "-Crpath" "-Cdebuginfo=0" "-Lnative=/checkout/obj/build/aarch64-unknown-linux-gnu/native/rust-test-helpers" "--check-cfg=cfg()" "-Zcheck-cfg-all-expected"
stdout: none
--- stderr -------------------------------
warning: unexpected `cfg` condition value: `_UNEXPECTED_VALUE`
##[warning]  --> /checkout/tests/ui/check-cfg/target_feature.rs:16:10
   |
LL |     cfg!(target_feature = "_UNEXPECTED_VALUE");
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expected values for `target_feature` are: `10e60`, `2e3`, `32s`, `3e3r1`, `3e3r2`, `3e3r3`, `3e7`, `7e10`, `a`, `aclass`, `adx`, `aes`, `altivec`, `alu32`, `amx-avx512`, `amx-bf16`, `amx-complex`, `amx-fp16`, `amx-fp8`, `amx-int8`, `amx-movrs`, `amx-tf32`, `amx-tile`, `amx-transpose`, `apxf`, `atomctl`, `atomics`, `avx`, `avx10.1`, `avx10.2`, `avx2`, `avx512bf16`, `avx512bitalg`, `avx512bw`, `avx512cd`, `avx512dq`, `avx512f`, `avx512fp16`, `avx512ifma`, `avx512vbmi`, `avx512vbmi2`, `avx512vl`, `avx512vnni`, `avx512vp2intersect`, `avx512vpopcntdq`, `avxifma`, `avxneconvert`, `avxvnni`, `avxvnniint16`, `avxvnniint8`, `b`, `backchain`, `bf16`, `bmi1`, `bmi2`, `bti`, `bulk-memory`, `c`, `cache`, `cmpxchg16b`, `concurrent-functions`, `coprocessor`, `crc`, `crt-static`, `cssc`, `d`, `d32`, `debug`, `deflate-conversion`, `dit`, `div32`, `doloop`, `dotprod`, `dpb`, `dpb2`, `dsp`, `dsp1e2`, `dspe60`, `e`, `e1`, `e2`, `ecv`, `edsp`, `elrw`, `enhanced-sort`, `ermsb`, `exception`, `exception-handling`, `extended-const`, `extendedl32r`, `f`, `f16c`, `f32mm`, `f64mm`, `faminmax`, `fcma`, `fdivdu`, `fhm`, `flagm`, `flagm2`, `float1e2`, `float1e3`, `float3e4`, `float7e60`, `floate1`, `fma`, `fp`, `fp-armv8`, `fp16`, `fp64`, `fp8`, `fp8dot2`, `fp8dot4`, `fp8fma`, `fpregs`, `fpuv2_df`, `fpuv2_sf`, `fpuv3_df`, `fpuv3_hf`, `fpuv3_hi`, `fpuv3_sf`, `frecipe`, `frintts`, `fxsr`, `gfni`, `guarded-storage`, `hard-float`, `hard-float-abi`, `hard-tp`, `hbc`, `high-registers`, `high-word`, `highpriinterrupts`, `hvx`, `hvx-length128b`, `hwdiv`, `i8mm`, `interrupt`, `isa-68000`, `isa-68010`, `isa-68020`, `isa-68030`, `isa-68040`, `isa-68060`, `isa-68881`, `isa-68882`, `jsconv`, `kl`, `lahfsahf`, `lam-bh`, `lamcas`, `lasx`, `lbt`, `ld-seq-sa`, `leoncasa`, `loop`, `lor`, `lse`, `lse128`, `lse2`, `lsx`, `lut`, `lvz`, `lzcnt`, `m`, `mac16`, `mclass`, `memctl`, `message-security-assist-extension12`, `message-security-assist-extension3`, `message-security-assist-extension4`, 
`message-security-assist-extension5`, `message-security-assist-extension8`, `message-security-assist-extension9`, `miscellaneous-extensions-2`, `miscellaneous-extensions-3`, `miscellaneous-extensions-4`, `miscsr`, `mops`, `movbe`, `movrs`, `mp`, `mp1e2`, `msa`, `msync`, `mte`, `mul32`, `mul32high`, `multivalue`, `mutable-globals`, `neon`, `nnp-assist`, `nontrapping-fptoint`, `nsa`, `nvic`, `outline-atomics`, `paca`, `pacg`, `pan`, `partword-atomics`, `pauth-lr`, `pclmulqdq`, `pmuv3`, `popcnt`, `power10-vector`, `power8-altivec`, `power8-crypto`, `power8-vector`, `power9-altivec`, `power9-vector`, `prfchw`, `prid`, `ptx32`, `ptx40`, `ptx41`, `ptx42`, `ptx43`, `ptx50`, `ptx60`, `ptx61`, `ptx62`, `ptx63`, `ptx64`, `ptx65`, `ptx70`, `ptx71`, `ptx72`, `ptx73`, `ptx74`, `ptx75`, `ptx76`, `ptx77`, `ptx78`, `ptx80`, `ptx81`, `ptx82`, `ptx83`, `ptx84`, `ptx85`, `ptx86`, `ptx87`, `quadword-atomics`, `rand`, `ras`, `rclass`, `rcpc`, `rcpc2`, `rcpc3`, `rdm`, `rdrand`, `rdseed`, `reference-types`, `regprotect`, `relax`, `relaxed-simd`, `rtm`, `rva23u64`, `rvector`, `s32c1i`, `sb`, `scq`, `sext`, `sha`, `sha2`, `sha3`, `sha512`, `sign-ext`, `simd128`, `sm3`, `sm4`, `sm_100`, `sm_100a`, `sm_101`, `sm_101a`, `sm_120`, `sm_120a`, `sm_20`, `sm_21`, `sm_30`, `sm_32`, `sm_35`, `sm_37`, `sm_50`, `sm_52`, `sm_53`, `sm_60`, `sm_61`, `sm_62`, `sm_70`, `sm_72`, `sm_75`, `sm_80`, `sm_86`, `sm_87`, `sm_89`, `sm_90`, `sm_90a`, `sme`, `sme-b16b16`, `sme-f16f16`, `sme-f64f64`, `sme-f8f16`, `sme-f8f32`, `sme-fa64`, `sme-i16i64`, `sme-lutv2`, `sme2`, `sme2p1`, `soft-float`, `spe`, `ssbs`, `sse`, `sse2`, `sse3`, `sse4.1`, `sse4.2`, `sse4a`, `ssse3`, `ssve-fp8dot2`, `ssve-fp8dot4`, `ssve-fp8fma`, `supm`, `sve`, `sve-b16b16`, `sve2`, `sve2-aes`, `sve2-bitperm`, `sve2-sha3`, `sve2-sm4`, `sve2p1`, `tail-call`, `tbm`, `threadptr`, `thumb-mode`, `thumb2`, `timerint`, `tme`, `transactional-execution`, `trust`, `trustzone`, `ual`, `unaligned-scalar-mem`, `unaligned-vector-mem`, `v`, `v5te`, `v6`, `v6k`, 
`v6t2`, `v7`, `v8`, `v8.1a`, `v8.2a`, `v8.3a`, `v8.4a`, `v8.5a`, `v8.6a`, `v8.7a`, `v8.8a`, `v8.9a`, `v8plus`, `v9`, `v9.1a`, `v9.2a`, `v9.3a`, `v9.4a`, `v9.5a`, `v9a`, `vaes`, `vdsp2e60f`, `vdspv1`, `vdspv2`, `vector`, `vector-enhancements-1`, `vector-enhancements-2`, `vector-enhancements-3`, `vector-packed-decimal`, `vector-packed-decimal-enhancement`, `vector-packed-decimal-enhancement-2`, `vector-packed-decimal-enhancement-3`, `vfp2`, `vfp3`, `vfp4`, `vh`, `virt`, `virtualization`, `vpclmulqdq`, `vsx`, `wfxt`, `wide-arithmetic`, `widekl`, `windowed`, `x87`, `xop`, `xsave`, `xsavec`, `xsaveopt`, `xsaves`, `za128rs`, `za64rs`, `zaamo`, `zabha`, `zacas`, `zalrsc`, `zama16b`, `zawrs`, `zba`, `zbb`, `zbc`, `zbkb`, `zbkc`, `zbkx`, `zbs`, `zca`, `zcb`, `zcmop`, `zdinx`, `zfa`, `zfbfmin`, `zfh`, `zfhmin`, `zfinx`, `zhinx`, `zhinxmin`, `zic64b`, `zicbom`, `zicbop`, `zicboz`, `ziccamoa`, `ziccif`, `zicclsm`, `ziccrse`, `zicntr`, `zicond`, `zicsr`, `zifencei`, `zihintntl`, `zihintpause`, `zihpm`, `zimop`, `zk`, `zkn`, `zknd`, `zkne`, `zknh`, `zkr`, `zks`, `zksed`, `zksh`, `zkt`, `ztso`, `zvbb`, `zvbc`, `zve32f`, `zve32x`, `zve64d`, `zve64f`, `zve64x`, `zvfbfmin`, `zvfbfwma`, `zvfh`, `zvfhmin`, `zvkb`, `zvkg`, `zvkn`, `zvknc`, `zvkned`, `zvkng`, `zvknha`, `zvknhb`, `zvks`, `zvksc`, `zvksed`, `zvksg`, `zvksh`, `zvkt`, `zvl1024b`, `zvl128b`, `zvl16384b`, `zvl2048b`, `zvl256b`, `zvl32768b`, `zvl32b`, `zvl4096b`, `zvl512b`, `zvl64b`, `zvl65536b`, and `zvl8192b`
   = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg.html> for more information about checking conditional configuration
   = note: `#[warn(unexpected_cfgs)]` on by default

warning: 1 warning emitted
------------------------------------------

// Custom TIE extensions - https://en.wikipedia.org/wiki/Tensilica_Instruction_Extension
// Espressif specific, and are validated based on the cpu name
gpio_out: reg = ["gpio_out"] % has_gpio_out,
expstate: reg = ["expstate"] % has_expstate,
Member

Most of these seem to be control registers. For most targets you wouldn't directly ask for the value to be present in the target register at the start of the inline asm block, but rather put it in a GPR and then use whichever special instruction is necessary to move it into this control register (`wrmsr` or `mov crN, ...` on x86, `msr` on arm, `csrrw` on riscv, ...) inside the asm block. In fact, I don't think you can tell LLVM to do this kind of write to control registers.

Aren't writes to control registers generally effectively side-effectful? Making that go through the regular register allocator seems like it could reorder the writes in a way that may cause unintended behavior. Or are you special-casing writes to those control registers to happen right at the start of the inline asm block?
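
A minimal sketch of the pattern described above, not code from this PR. It uses x86-64's MXCSR as a stand-in control register, only because MXCSR can be written from user mode; that choice, and the names `read_mxcsr` and `rewrite_mxcsr`, are assumptions made for illustration. The `asm!` operands are ordinary general-purpose registers (here holding a pointer to the value), and the move into the control register is an explicit instruction inside the template, so its position relative to the other instructions in the block is fixed by the programmer rather than by the register allocator.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Reads MXCSR (the SSE control/status register) into a plain `u32`.
#[cfg(target_arch = "x86_64")]
pub fn read_mxcsr() -> u32 {
    let mut value: u32 = 0;
    // SAFETY: `stmxcsr` stores exactly four bytes to the valid `u32` behind `ptr`.
    unsafe {
        asm!(
            "stmxcsr [{ptr}]",
            ptr = in(reg) &mut value as *mut u32,
            options(nostack),
        );
    }
    value
}

/// Writes the current MXCSR value back to MXCSR and returns (before, after).
/// The operand is a general-purpose register chosen by the allocator; the
/// control-register write is the `ldmxcsr` instruction in the template.
#[cfg(target_arch = "x86_64")]
pub fn rewrite_mxcsr() -> (u32, u32) {
    let before = read_mxcsr();
    // SAFETY: the value written is the one just read, so the rounding mode
    // and exception masks are unchanged when the block exits.
    unsafe {
        asm!(
            "ldmxcsr [{ptr}]",
            ptr = in(reg) &before as *const u32,
            options(nostack, readonly),
        );
    }
    (before, read_mxcsr())
}

/// Placeholder so the sketch still compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn rewrite_mxcsr() -> (u32, u32) {
    (0, 0)
}
```

On Xtensa the same shape would presumably be a value in an address register followed by the relevant special-register write inside the asm block, rather than a control register exposed as an operand class.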

Member

As a general rule, only 2 kinds of registers need to be made available as inline asm operands:

  • Registers which can hold values as inputs and outputs
  • Registers which are not preserved across calls, so that they can be specified as clobbers (needed for clobber_abi).

Everything else isn't needed here.
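
The two kinds of registers in that rule can be illustrated with a short sketch. It is written for x86-64 because inline asm is stable there; the function names `add_with_scratch` and `clobber_caller_saved` are hypothetical, and nothing here is taken from the PR. The first function uses value-carrying operands (`in`, `inout`) plus one explicit clobber; the second uses `clobber_abi("C")`, which is why registers that are not preserved across calls have to be known to the compiler.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Adds `b` to `a`, going through `rcx` as a scratch register.
#[cfg(target_arch = "x86_64")]
pub fn add_with_scratch(a: u64, b: u64) -> u64 {
    let mut sum = a;
    // SAFETY: the block only touches its operands and the declared clobber.
    unsafe {
        asm!(
            "mov rcx, {rhs}",
            "add {acc}, rcx",
            acc = inout(reg) sum, // register holding a value as input and output
            rhs = in(reg) b,      // register holding a value as input
            out("rcx") _,         // register named only so it can be clobbered
            options(pure, nomem, nostack),
        );
    }
    sum
}

/// Marks every register that the "C" ABI does not preserve as clobbered.
#[cfg(target_arch = "x86_64")]
pub fn clobber_caller_saved() {
    // SAFETY: `nop` has no effect; the clobber list only constrains the compiler.
    unsafe {
        asm!("nop", clobber_abi("C"));
    }
}

/// Fallbacks so the sketch still compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_with_scratch(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn clobber_caller_saved() {}
```

A control register such as `gpio_out` fits neither role: it is not allocated to hold an operand value, and it is not part of the call-clobbered set.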

Member

@Amanieu Amanieu left a comment

This is also missing an update to the unstable book entry for asm_experimental_arch.


#error = ["a0"] => "a0 is used internally by LLVM and cannot be used as an operand for inline asm",
#error = ["sp", "a1"] => "sp is used internally by LLVM and cannot be used as an operand for inline asm",
#error = ["a7"] => "a7 is used internally by LLVM as a frame pointer and cannot be used as an operand for inline asm",
Member

The frame pointer can be either a7 or a15 depending on the ABI.
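
One way to picture an ABI-dependent check is the standalone sketch below. It is not rustc's `def_regs!` machinery: the `XtensaAbi` enum, `frame_pointer`, and `check_operand` are hypothetical names, and the mapping (a7 for the windowed ABI, a15 for call0) is an assumption that should be confirmed against the LLVM backend. The error strings for `a0` and `sp`/`a1` are copied from the diff above.

```rust
/// Hypothetical model of the two Xtensa calling conventions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum XtensaAbi {
    Windowed,
    Call0,
}

/// Assumed mapping from ABI to frame pointer register.
pub fn frame_pointer(abi: XtensaAbi) -> &'static str {
    match abi {
        XtensaAbi::Windowed => "a7",
        XtensaAbi::Call0 => "a15",
    }
}

/// Rejects registers that cannot be used as inline asm operands under `abi`.
pub fn check_operand(reg: &str, abi: XtensaAbi) -> Result<(), String> {
    match reg {
        "a0" => Err(
            "a0 is used internally by LLVM and cannot be used as an operand for inline asm"
                .to_string(),
        ),
        "sp" | "a1" => Err(
            "sp is used internally by LLVM and cannot be used as an operand for inline asm"
                .to_string(),
        ),
        // Only the frame pointer of the active ABI is reserved.
        r if r == frame_pointer(abi) => Err(format!(
            "{r} is used internally by LLVM as a frame pointer and cannot be used as an operand for inline asm"
        )),
        _ => Ok(()),
    }
}
```

With a check of this shape, a15 stays usable under the windowed ABI and a7 stays usable under call0, instead of a7 being rejected unconditionally.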

def_reg_class! {
Xtensa XtensaInlineAsmRegClass {
reg,
freg,
Member

LLVM doesn't seem to support this. XtensaTargetLowering::getRegForInlineAsmConstraint only handles r constraints.
