[RISC-V] Intrinsic has incorrect return type! (@llvm.vp.fcmp.nxv2f32, @llvm.vp.select.nxv2i32) #7173

dkurt · 2022-11-21T14:47:36Z

Hi, @zvookin! While trying the RISC-V backend, I got the following error on a specific algorithm: the max pooling layer from deep learning. It reproduces only when .vectorize is applied.

LLVM: 15.0.2

Reproducer

#include "Halide.h"

#include <iostream>
#include <vector>

using namespace Halide;

int main(int argc, char** argv) {
    Func top;
    Var x("x"), y("y"), c("c"), n("n");

    Buffer<float> input(1, 96, 55, 55);

    Halide::RDom r(0, 2, 0, 2);
    Halide::Expr kx, ky;
    kx = min(x * 2 + r.x, 54);
    ky = min(y * 2 + r.y, 54);

    Halide::Tuple res = argmax(input(kx, ky, c, n));
    top(x, y, c, n) = res[2];

    top.bound(x, 0, 27)
       .bound(y, 0, 27)
       .bound(c, 0, 96)
       .bound(n, 0, 1);

    top.vectorize(x, 8);

    // Target setup: RISC-V with the vector extension (RVV) and 256-bit vectors
    Target target = get_host_target();
    target.vector_bits = 8 * sizeof(float) * 8;

    std::vector<Target::Feature> features;
    features.push_back(Target::RVV);
    target.set_features(features);

    std::cout << target << std::endl;

    top.print_loop_nest();

    try {
        top.compile_to_static_library("compiled", {}, "pooling", target);
    } catch(Halide::InternalError& ex) {
        std::cout << ex.what() << std::endl;
    }
    return 0;
}

target(riscv-64-linux-rvv-vector_bits_256)
produce f0:
  for c in [0, 95]:
    for y in [0, 26]:
      for x.x in [0, 3]:
        vectorized x.v2 in [0, 7]:
          produce argmax:
            argmax(...) = ...
            for r4 in [0, 1]:
              for r4 in [0, 1]:
                argmax(...) = ...
          consume argmax:
            f0(...) = ...
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f32
Internal Error at /home/dkurt/Halide/src/CodeGen_LLVM.cpp:632 triggered by user code at : Condition failed: !verifyFunction(*function, &llvm::errs()):

If I use just top(x, y, c, n) = maximum(input(kx, ky, c, n)) instead, the errors are:

Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f3


dkurt · 2022-11-28T15:35:06Z

Intrinsic has incorrect return type!
ptr @llvm.vp.icmp.nxv2i32

If I understand correctly, the icmp return type should be a vector of i1, not a vector of i32: https://llvm.org/docs/LangRef.html#llvm-vp-icmp-intrinsics

Halide currently generates:

declare <vscale x 2 x i32> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i32>, metadata, <vscale x 2 x i1>, i32) #10

instead of the expected:

declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i32>, metadata, <vscale x 2 x i1>, i32) #10

Update: see proposal in #7183

dkurt · 2022-11-30T10:00:22Z

One more similar error:

Intrinsic has incorrect return type!
ptr @llvm.riscv.vaadd.i64.i64.i64
Intrinsic has incorrect return type!
ptr @llvm.riscv.vaaddu.i64.i64.i64
Intrinsic has incorrect return type!
ptr @llvm.riscv.vaadd.i64.i64.i64
Intrinsic has incorrect return type!
ptr @llvm.riscv.vaaddu.i64.i64.i64

Reproducer (RGB to Grayscale)

#include <Halide.h>

#include <cstdint>
#include <iostream>
#include <vector>

using namespace Halide;

const int width = 640;
const int height = 480;

int main(int argc, char** argv) {
    uint16_t R2GRAY = 77, G2GRAY = 150, B2GRAY = 29;

    std::vector<uint8_t> src(width * height * 3);
    std::vector<uint8_t> dst(width * height);

    Func f("rgb2gray");
    auto input = Buffer<uint8_t>::make_interleaved(src.data(), width, height, 3);

    Var x("x"), y("y");
    Expr r = cast<uint16_t>(input(x, y, 0));
    Expr g = cast<uint16_t>(input(x, y, 1));
    Expr b = cast<uint16_t>(input(x, y, 2));
    f(x, y) = cast<uint8_t>((R2GRAY * r + G2GRAY * g + B2GRAY * b) >> 8);

    Var yo("yo"), yi("yi");
    f.split(y, yo, yi, 64)
     .parallel(yo)
     .vectorize(x, 8);

    Target target = get_host_target();
    target.vector_bits = 8 * sizeof(uint8_t) * 8;

    std::vector<Target::Feature> features;
    features.push_back(Target::RVV);
    target.set_features(features);

    std::cout << target << std::endl;

    f.print_loop_nest();

    Buffer<uint8_t> output(dst.data(), {width, height});
    f.realize(output, target);
}

It fails only for log2_of_scale=3. Here are the functions printed from define_riscv_intrinsic_wrapper just before verifyFunction:

---------------------------------------- log2_of_scale=0: int8x8 halving_add
; Function Attrs: alwaysinline nounwind
define internal <vscale x 8 x i8> @halving_add_wrapper(<vscale x 8 x i8> %0, <vscale x 8 x i8> %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call <vscale x 8 x i8> @llvm.riscv.vaadd.nxv8i8.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %0, <vscale x 8 x i8> %1, i64 8)
  ret <vscale x 8 x i8> %2
}

---------------------------------------- log2_of_scale=1: int16x4 halving_add
; Function Attrs: alwaysinline nounwind
define internal <vscale x 4 x i16> @"halving_add_wrapper$1"(<vscale x 4 x i16> %0, <vscale x 4 x i16> %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call <vscale x 4 x i16> @llvm.riscv.vaadd.nxv4i16.nxv4i16.i64(<vscale x 4 x i16> undef, <vscale x 4 x i16> %0, <vscale x 4 x i16> %1, i64 4)
  ret <vscale x 4 x i16> %2
}

---------------------------------------- log2_of_scale=2: int32x2 halving_add
; Function Attrs: alwaysinline nounwind
define internal <vscale x 2 x i32> @"halving_add_wrapper$2"(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call <vscale x 2 x i32> @llvm.riscv.vaadd.nxv2i32.nxv2i32.i64(<vscale x 2 x i32> undef, <vscale x 2 x i32> %0, <vscale x 2 x i32> %1, i64 2)
  ret <vscale x 2 x i32> %2
}

---------------------------------------- log2_of_scale=3: int64 halving_add
; Function Attrs: alwaysinline nounwind
define internal i64 @"halving_add_wrapper$3"(i64 %0, i64 %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call i64 @llvm.riscv.vaadd.i64.i64.i64(i64 undef, i64 %0, i64 %1, i64 1)
  ret i64 %2
}

Is <vscale x 1 x i64> expected here instead of a scalar i64?

See proposal at #7192

miles-rusch-berkeley · 2022-12-06T06:33:19Z

Hi, @zvookin, I am getting a similar error when vectorizing a pipeline that uses BoundaryConditions::repeat_edge(Input)(x, y):

% g++ gaussian_blur.cpp -g -I includeHalide -I toolsHalide -L libHalide -lHalide -o lesson_01 -std=c++17 -lpthread -ldl
% DYLD_LIBRARY_PATH=libHalide ./lesson_01
...
Intrinsic has incorrect return type!
ptr @llvm.vp.icmp.nxv4i32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv4i32
Internal Error at /Users/milesrusch/Documents/risc-v/rvv/llvm15/Halide/src/CodeGen_LLVM.cpp:632 triggered by user code at :
Condition failed: !verifyFunction(*function, &llvm::errs()):
zsh: abort      DYLD_LIBRARY_PATH=libHalide ./lesson_01

The file that reproduces the error with the above commands is attached:
gaussian_blur.cpp.zip

dkurt · 2022-12-06T07:54:31Z

@miles-rusch-berkeley, in llvm/llvm-project#59252 it's recommended to add the +zvl256b flag.

miles-rusch-berkeley · 2022-12-06T20:58:43Z

Hi @dkurt, thanks for your help. Where can I add this flag? Right now I am just using g++ to compile my pipeline to a static library, and I don't see how to control the llc command. Do I have to use Halide's compile_to_llvm_assembly() or specify the zvl flag in Halide::Internal::CodeGen_LLVM::target?

zvookin · 2022-12-06T21:28:29Z

I should have a PR up shortly to fix both the icmp and select issues. The zvl flag would get added in CodeGen_RISCV::mattrs. It doesn't seem to fix the select issue, but I'm still debugging.

zvookin · 2022-12-06T22:34:57Z

The fix is in #7205.

zvookin · 2022-12-07T01:33:48Z

Should be fixed in main now.

dkurt · 2022-12-07T05:47:33Z

Thanks, @zvookin! Yes, both exceptions are now fixed on the main branch. The remaining select error is:

app: /home/dkurt/llvm-project/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:1732: virtual void llvm::RISCVDAGToDAGISel::Select(llvm::SDNode*): Assertion `RISCVTargetLowering::getRegClassIDForVecVT(SubVecContainerVT) == InRegClassID && "Unexpected subvector extraction"' failed.

I hope llvm/llvm-project#59252 (comment) may help.

dkurt closed this as completed Dec 7, 2022
