Cannot bit shift vec of bool #53

sjkelly · 2019-07-17T17:02:45Z

Julia scalar:

julia> typeof(true)
Bool

julia> true << 0x03
8

SIMD.jl:

julia> using SIMD

julia> Vec((true,true,false,true)) << Vec((0x01,0x02,0x03,0x04))
ERROR: InexactError: Bool(8)
Stacktrace:
 [1] Type at ./float.jl:73 [inlined]
 [2] llvmconst(::Int64, ::Type{Bool}, ::Int64) at /home/steve/.julia/dev/SIMD/src/SIMD.jl:277
 [3] #s17#41 at /home/steve/.julia/dev/SIMD/src/SIMD.jl:859 [inlined]
 [4] #s17#41(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) at ./none:0
 [5] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any,N} where N) at ./boot.jl:524
 [6] <<(::Vec{4,Bool}, ::Vec{4,UInt8}) at /home/steve/.julia/dev/SIMD/src/SIMD.jl:1030
 [7] top-level scope at REPL[20]:1

I have tried this with all Int types and it does not seem to work. It does however work when converting, but this generates less than ideal code.

I am just starting to look at the code here and I do not know if this is related:

SIMD.jl/src/SIMD.jl

Line 227 in 1245797

llvmtype(::Type{Bool}) = "i8" # Julia represents Tuple{Bool} as [1 x i8]

It seems that the shift operator calls this llvmconst where it is initialized as i1 rather than i8 where julia uses i8 for Bool.

SIMD.jl/src/SIMD.jl

Lines 266 to 270 in 1245797

 function llvmconst(::Type{Bool}, val) 

 Bool(val) === false && return "zeroinitializer" 

 typ = "i1" 

 "$typ $(Int(val))" 

 end

The text was updated successfully, but these errors were encountered:

eschnett · 2019-07-17T18:13:21Z

Booleans do not support bitshift operations. When you call true << 0x03, then true is implicitly converted to Int first. Since conversions can be expensive for SIMD values, there is no automatic conversion (promotion) implemented in SIMD.jl.

You can convert the vector yourself, choosing the target integer size:

f8(x::Vec{4, Bool}) = convert(Vec{4, Int8}, x) << Vec((0x01,0x02,0x03,0x04))
f16(x::Vec{4, Bool}) = convert(Vec{4, Int16}, x) << Vec((0x01,0x02,0x03,0x04))

The first (using Int8) is efficient, as the conversion requires only a single machine instruction (and i8 1):

define { <4 x i8> } @julia_f8_16841({ <4 x i8> } addrspace(11)* nocapture nonnull readonly dereferenceable(4)) {
top:
          %1 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0
          %2 = load <4 x i8>, <4 x i8> addrspace(11)* %1, align 1
          %3 = and <4 x i8> %2, <i8 1, i8 1, i8 1, i8 1>
     %tmp.i = shl <4 x i8> %3, <i8 1, i8 2, i8 3, i8 4>
  %.fca.0.insert = insertvalue { <4 x i8> } undef, <4 x i8> %tmp.i, 0
  ret { <4 x i8> } %.fca.0.insert
}

The second (using Int16) is much more expensive:

define { <4 x i16> } @julia_f16_16842({ <4 x i8> } addrspace(11)* nocapture nonnull readonly dereferenceable(4)) {
        %1 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0, i64 0
          %2 = load i8, i8 addrspace(11)* %1, align 1
          %3 = zext i8 %2 to i16
        %4 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0, i64 1
          %5 = load i8, i8 addrspace(11)* %4, align 1
          %6 = zext i8 %5 to i16
        %7 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0, i64 2
          %8 = load i8, i8 addrspace(11)* %7, align 1
          %9 = zext i8 %8 to i16
        %10 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0, i64 3
          %11 = load i8, i8 addrspace(11)* %10, align 1
          %12 = zext i8 %11 to i16
     %13 = insertelement <4 x i16> undef, i16 %3, i32 0
     %14 = insertelement <4 x i16> %13, i16 %6, i32 1
     %15 = insertelement <4 x i16> %14, i16 %9, i32 2
     %16 = insertelement <4 x i16> %15, i16 %12, i32 3
     %tmp.i = shl <4 x i16> %16, <i16 1, i16 2, i16 3, i16 4>
  %.fca.0.insert = insertvalue { <4 x i16> } undef, <4 x i16> %tmp.i, 0
  ret { <4 x i16> } %.fca.0.insert
}

One could argue that SIMD should auto-convert Bool to Int8 here, but I think that would only add confusion and would not be very useful in practice. A manual conversion seems the right approach here. (And a proper error message for trying to shift booleans.)

sjkelly · 2019-07-17T23:42:03Z

I understand why explicit is better than implicit here. Fortunately I am using UInt8 and in my updated example the codegen looks very good. Thank you for the explanation.

eschnett · 2019-07-18T01:30:39Z

You should be able to shave off the and instruction if you use Base.bitcast to convert Bool to UInt8. I don't know why Julia inserts the and i8 1 – this looks as if the internal representation of Bool could leave the unused bits undefined, instead of ensuring they are zero at all times.

KristofferC · 2020-02-23T15:12:59Z

Yeah, Bool is not really a computational type and Base just immediately promotes it in arithmetic. So just convert it is the right thing to do.

I tired the f8, f16 examples in #63 and it seems the codegen is good now (no and operation and normal code for i16).

julia> code_llvm(f8, Tuple{Vec{4, Bool}}; debuginfo=:none)

define { <4 x i8> } @julia_f8_17139({ <4 x i8> } addrspace(11)* nocapture nonnull readonly dereferenceable(4)) {
top:
  %1 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0
  %2 = load <4 x i8>, <4 x i8> addrspace(11)* %1, align 4
  %3 = shl <4 x i8> %2, <i8 1, i8 2, i8 3, i8 4>
  %.fca.0.insert = insertvalue { <4 x i8> } undef, <4 x i8> %3, 0
  ret { <4 x i8> } %.fca.0.insert
}

julia> code_llvm(f16, Tuple{Vec{4, Bool}}; debuginfo=:none)

define { <4 x i16> } @julia_f16_17156({ <4 x i8> } addrspace(11)* nocapture nonnull readonly dereferenceable(4)) {
top:
  %1 = getelementptr inbounds { <4 x i8> }, { <4 x i8> } addrspace(11)* %0, i64 0, i32 0
  %2 = load <4 x i8>, <4 x i8> addrspace(11)* %1, align 4
  %3 = sext <4 x i8> %2 to <4 x i16>
  %4 = shl <4 x i16> %3, <i16 1, i16 2, i16 3, i16 4>
  %.fca.0.insert = insertvalue { <4 x i16> } undef, <4 x i16> %4, 0
  ret { <4 x i16> } %.fca.0.insert
}

KristofferC closed this as completed Feb 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cannot bit shift vec of bool #53

Cannot bit shift vec of bool #53

sjkelly commented Jul 17, 2019

eschnett commented Jul 17, 2019

sjkelly commented Jul 17, 2019

eschnett commented Jul 18, 2019

KristofferC commented Feb 23, 2020

Cannot bit shift vec of bool #53

Cannot bit shift vec of bool #53

Comments

sjkelly commented Jul 17, 2019

eschnett commented Jul 17, 2019

sjkelly commented Jul 17, 2019

eschnett commented Jul 18, 2019

KristofferC commented Feb 23, 2020