Compile constant SIMD initialiser to a constant vector expression #18147

Closed
huonw opened this issue Oct 18, 2014 · 5 comments
Labels
A-codegen Area: Code generation A-SIMD Area: SIMD (Single Instruction Multiple Data) C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@huonw
Member

huonw commented Oct 18, 2014

Currently

#![crate_type = "lib"]

pub fn foo(x: f64, y: f64) -> std::simd::f64x2 {
    std::simd::f64x2(0.0, 1.0)
}

becomes, with no optimisations,

; Function Attrs: uwtable
define <2 x double> @_ZN3foo20h36a71d373a6347d3daaE(double, double) unnamed_addr #0 {
entry-block:
  %sret_slot = alloca <2 x double>
  %x = alloca double
  %y = alloca double
  store double %0, double* %x
  store double %1, double* %y
  %2 = getelementptr inbounds <2 x double>* %sret_slot, i32 0, i32 0
  store double 0.000000e+00, double* %2
  %3 = getelementptr inbounds <2 x double>* %sret_slot, i32 0, i32 1
  store double 1.000000e+00, double* %3
  %4 = load <2 x double>* %sret_slot
  ret <2 x double> %4
}

After optimisations it becomes

; Function Attrs: nounwind readnone uwtable
define <2 x double> @_ZN3foo20h36a71d373a6347d3daaE(double, double) unnamed_addr #0 {
entry-block:
  ret <2 x double> <double 0.000000e+00, double 1.000000e+00>
}

We could detect constants in a SIMD initialiser and compile to this directly, making our no-opt code faster, and saving the optimiser work.
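A minimal sketch of the detection step described above, assuming a toy model of the initialiser rather than rustc's real data structures. `Lane`, `Lowered`, and `lower_simd_init` are hypothetical names invented for illustration. The idea is to emit a single constant vector when every lane is a constant, and otherwise fall back to the per-lane stores seen in the unoptimised output.

```rust
/// Hypothetical stand-in for one lane of a SIMD initialiser:
/// either a compile-time constant or a value only known at runtime.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Lane {
    Const(f64),
    /// Index of some local variable; illustrative only.
    Runtime(usize),
}

/// Hypothetical stand-in for what code generation would emit.
#[derive(Debug, PartialEq)]
enum Lowered {
    /// One constant vector, like `<2 x double> <double 0.0, double 1.0>`.
    ConstVector(Vec<f64>),
    /// One store per lane, as in the unoptimised IR above.
    PerLaneStores(Vec<Lane>),
}

/// Emits a constant vector only if *every* lane is a constant.
fn lower_simd_init(lanes: &[Lane]) -> Lowered {
    // Collecting into Option<Vec<_>> gives None as soon as one lane
    // is not a constant.
    let consts: Option<Vec<f64>> = lanes
        .iter()
        .map(|lane| match lane {
            Lane::Const(v) => Some(*v),
            Lane::Runtime(_) => None,
        })
        .collect();

    match consts {
        Some(values) => Lowered::ConstVector(values),
        None => Lowered::PerLaneStores(lanes.to_vec()),
    }
}
```

In this model, `lower_simd_init(&[Lane::Const(0.0), Lane::Const(1.0)])` takes the constant path, while any initialiser that mentions `x` or `y` would keep the per-lane stores.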

@steveklabnik
Member

@huonw with std::simd gone, is this still an issue? I haven't been keeping as close an eye on your simd work.

@huonw
Member Author

huonw commented Feb 2, 2016

Yes, this applies to any #[repr(simd)] type.

@Mark-Simulacrum Mark-Simulacrum added A-SIMD Area: SIMD (Single Instruction Multiple Data) C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 22, 2017
@nox
Contributor

nox commented Mar 31, 2018

Cc @rust-lang/wg-compiler-performance now that SIMD support is going to be stabilised.

@workingjubilee
Member

workingjubilee commented Oct 15, 2020

According to Godbolt, rustc +nightly --emit=llvm-ir -Copt-level=0 now gives

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @_ZN7example3foo17h2ac4b13db8a0abecE
  (<2 x double>* noalias nocapture sret dereferenceable(16) %0, double %x, double %y)
  unnamed_addr #0 !dbg !6 {
    %1 = bitcast <2 x double>* %0 to double*, !dbg !10
    store double 0.000000e+00, double* %1, align 16, !dbg !10
    %2 = getelementptr inbounds <2 x double>, <2 x double>* %0, i32 0, i32 1, !dbg !10
    store double 1.000000e+00, double* %2, align 8, !dbg !10
    ret void, !dbg !11
}

attributes #0 = { nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

!llvm.module.flags = !{!0, !1, !2}
!llvm.dbg.cu = !{!3}

and -Copt-level=3 now gives

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @_ZN7example3foo17h2ac4b13db8a0abecE
  (<2 x double>* noalias nocapture sret dereferenceable(16) %0, double %x, double %y)
  unnamed_addr #0 !dbg !6 {
    store <2 x double> <double 0.000000e+00, double 1.000000e+00>, <2 x double>* %0, align 16, !dbg !10
    ret void, !dbg !11
}

attributes #0 = { nofree norecurse nounwind nonlazybind uwtable writeonly
"probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

I'm... not sure how to read what seem like significant syntactic differences in these IR outputs, but I have a hunch that the result is "no change".
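One way to read the two listings, sketched as a plain-Rust model rather than as anything the compiler actually produces: in both, the function now writes its result through the `sret` pointer `%0` instead of returning the vector by value. The opt-level 0 version still stores each lane separately, while the opt-level 3 version stores one constant vector. The `[f64; 2]` array stands in for `<2 x double>`, and the function names below are hypothetical.

```rust
/// Models the -Copt-level=0 listing: two scalar stores, one per lane.
fn init_per_lane(out: &mut [f64; 2]) {
    out[0] = 0.0; // store double 0.0 into lane 0
    out[1] = 1.0; // getelementptr to lane 1, then store double 1.0
}

/// Models the -Copt-level=3 listing: a single store of a constant vector.
fn init_whole(out: &mut [f64; 2]) {
    *out = [0.0, 1.0];
}

/// Checks that both shapes leave the same bits in memory.
fn same_result() -> bool {
    // Start from NaN so that a lane left unwritten would be visible.
    let mut a = [f64::NAN; 2];
    let mut b = [f64::NAN; 2];
    init_per_lane(&mut a);
    init_whole(&mut b);
    a.map(f64::to_bits) == b.map(f64::to_bits)
}
```

Both shapes write the same memory, so the only difference is how much work the unoptimised build does to get there, which is what this issue is about.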

@workingjubilee
Member

Hm... With new eyes, I can affirm. No change.
Soluble, though.

@bors bors closed this as completed in 540891b Oct 17, 2024
5 participants