[RISC-V] Intrinsic has incorrect return type! (@llvm.vp.fcmp.nxv2f32, @llvm.vp.select.nxv2i32) #7173

Closed
dkurt opened this issue Nov 21, 2022 · 9 comments

dkurt commented Nov 21, 2022

Hi, @zvookin! While trying the RISC-V backend, I got the following error on a specific implementation of a max pooling deep learning layer. It reproduces only when .vectorize is applied.

LLVM: 15.0.2

Reproducer
#include "Halide.h"

using namespace Halide;

int main(int argc, char** argv) {
    Func top;
    Var x("x"), y("y"), c("c"), n("n");

    Buffer<float> input(1, 96, 55, 55);

    Halide::RDom r(0, 2, 0, 2);
    Halide::Expr kx, ky;
    kx = min(x * 2 + r.x, 54);
    ky = min(y * 2 + r.y, 54);

    Halide::Tuple res = argmax(input(kx, ky, c, n));
    top(x, y, c, n) = res[2];

    top.bound(x, 0, 27)
       .bound(y, 0, 27)
       .bound(c, 0, 96)
       .bound(n, 0, 1);

    top.vectorize(x, 8);

    // Scheduling
    Target target = get_host_target();
    target.vector_bits = 8 * sizeof(float) * 8;

    std::vector<Target::Feature> features;
    features.push_back(Target::RVV);
    target.set_features(features);

    std::cout << target << std::endl;

    top.print_loop_nest();

    try {
        top.compile_to_static_library("compiled", {}, "pooling", target);
    } catch(Halide::InternalError& ex) {
        std::cout << ex.what() << std::endl;
    }
    return 0;
}

Output:
target(riscv-64-linux-rvv-vector_bits_256)
produce f0:
  for c in [0, 95]:
    for y in [0, 26]:
      for x.x in [0, 3]:
        vectorized x.v2 in [0, 7]:
          produce argmax:
            argmax(...) = ...
            for r4 in [0, 1]:
              for r4 in [0, 1]:
                argmax(...) = ...
          consume argmax:
            f0(...) = ...
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2i32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f32
Internal Error at /home/dkurt/Halide/src/CodeGen_LLVM.cpp:632 triggered by user code at : Condition failed: !verifyFunction(*function, &llvm::errs()):

If I use just top(x, y, c, n) = maximum(input(kx, ky, c, n)) instead, I get:

Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f32
Intrinsic has incorrect return type!
ptr @llvm.vp.fcmp.nxv2f32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv2f3

dkurt commented Nov 28, 2022

Intrinsic has incorrect return type!
ptr @llvm.vp.icmp.nxv2i32

If I understand correctly, the icmp return type should be a vector of i1, not a vector of i32: https://llvm.org/docs/LangRef.html#llvm-vp-icmp-intrinsics

Currently generated by Halide:

declare <vscale x 2 x i32> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i32>, metadata, <vscale x 2 x i1>, i32) #10

instead of the expected:

declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i32>, metadata, <vscale x 2 x i1>, i32) #10
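
To make the expected signature concrete, here is a minimal C++ sketch of the typing rule described in the LangRef: a VP comparison returns a mask with the same lane count as its operands, but with i1 elements. The `ScalableVec` struct and both helper functions are hypothetical names invented for this illustration. They only model the textual IR type and are not Halide or LLVM APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a scalable vector type such as <vscale x 2 x i32>.
struct ScalableVec {
    int min_lanes;     // the N in <vscale x N x T>
    std::string elem;  // element type name, e.g. "i32" or "float"
};

// Render the type in LLVM IR textual syntax.
std::string to_ir(const ScalableVec &t) {
    return "<vscale x " + std::to_string(t.min_lanes) + " x " + t.elem + ">";
}

// Result type of a vector comparison: same lane count, i1 elements.
ScalableVec compare_result_type(const ScalableVec &operand) {
    assert(operand.min_lanes > 0);
    return ScalableVec{operand.min_lanes, "i1"};
}
```

For example, `to_ir(compare_result_type(ScalableVec{2, "i32"}))` yields `<vscale x 2 x i1>`, which matches the second declaration above.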

Update: see proposal in #7183

dkurt commented Nov 30, 2022

One more similar error:

Intrinsic has incorrect return type!
ptr @llvm.riscv.vaadd.i64.i64.i64
Intrinsic has incorrect return type!
ptr @llvm.riscv.vaaddu.i64.i64.i64
Intrinsic has incorrect return type!
ptr @llvm.riscv.vaadd.i64.i64.i64
Intrinsic has incorrect return type!
ptr @llvm.riscv.vaaddu.i64.i64.i64
Reproducer (RGB to Grayscale)
#include <Halide.h>

using namespace Halide;

const int width = 640;
const int height = 480;

int main(int argc, char** argv) {
    uint16_t R2GRAY = 77, G2GRAY = 150, B2GRAY = 29;

    std::vector<uint8_t> src(width * height * 3);
    std::vector<uint8_t> dst(width * height);

    Func f("rgb2gray");
    auto input = Buffer<uint8_t>::make_interleaved(src.data(), width, height, 3);

    Var x("x"), y("y");
    Expr r = cast<uint16_t>(input(x, y, 0));
    Expr g = cast<uint16_t>(input(x, y, 1));
    Expr b = cast<uint16_t>(input(x, y, 2));
    f(x, y) = cast<uint8_t>((R2GRAY * r + G2GRAY * g + B2GRAY * b) >> 8);

    Var yo("yo"), yi("yi");
    f.split(y, yo, yi, 64)
     .parallel(yo)
     .vectorize(x, 8);

    Target target = get_host_target();
    target.vector_bits = 8 * sizeof(uint8_t) * 8;

    std::vector<Target::Feature> features;
    features.push_back(Target::RVV);
    target.set_features(features);

    std::cout << target << std::endl;

    f.print_loop_nest();

    Buffer<uint8_t> output(dst.data(), {width, height});
    f.realize(output, target);
}

It fails only for log2_of_scale=3. Below are the functions printed from define_riscv_intrinsic_wrapper before verifyFunction:

---------------------------------------- log2_of_scale=0: int8x8 halving_add
; Function Attrs: alwaysinline nounwind
define internal <vscale x 8 x i8> @halving_add_wrapper(<vscale x 8 x i8> %0, <vscale x 8 x i8> %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call <vscale x 8 x i8> @llvm.riscv.vaadd.nxv8i8.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %0, <vscale x 8 x i8> %1, i64 8)
  ret <vscale x 8 x i8> %2
}

---------------------------------------- log2_of_scale=1: int16x4 halving_add

; Function Attrs: alwaysinline nounwind
define internal <vscale x 4 x i16> @"halving_add_wrapper$1"(<vscale x 4 x i16> %0, <vscale x 4 x i16> %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call <vscale x 4 x i16> @llvm.riscv.vaadd.nxv4i16.nxv4i16.i64(<vscale x 4 x i16> undef, <vscale x 4 x i16> %0, <vscale x 4 x i16> %1, i64 4)
  ret <vscale x 4 x i16> %2
}

----------------------------------------  log2_of_scale=2: int32x2 halving_add
; Function Attrs: alwaysinline nounwind
define internal <vscale x 2 x i32> @"halving_add_wrapper$2"(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call <vscale x 2 x i32> @llvm.riscv.vaadd.nxv2i32.nxv2i32.i64(<vscale x 2 x i32> undef, <vscale x 2 x i32> %0, <vscale x 2 x i32> %1, i64 2)
  ret <vscale x 2 x i32> %2
}

---------------------------------------- log2_of_scale=3: int64 halving_add
; Function Attrs: alwaysinline nounwind
define internal i64 @"halving_add_wrapper$3"(i64 %0, i64 %1) #9 {
entry:
  call void asm sideeffect "csrw vxrm,${0:z}", "rJ,~{memory}"(i64 2)
  %2 = call i64 @llvm.riscv.vaadd.i64.i64.i64(i64 undef, i64 %0, i64 %1, i64 1)
  ret i64 %2
}

Is <vscale x 1 x i64> expected?
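
One way to read the dumps above: the lane count is the vector width divided by the element width, so at 64-bit elements it drops to one, and the wrapper then uses a scalar `i64`. The sketch below is a hypothetical illustration (not Halide's `define_riscv_intrinsic_wrapper`) of keeping a scalable vector type even for a single lane. It assumes 64 bits per `vscale` unit, inferred from the `<vscale x 8 x i8>` case, and an element width of `8 << log2_of_scale`. The helper names `lanes_for` and `wrapper_type` are made up for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: lanes per vscale unit for a given element width.
// vector_bits is assumed to be 64 here, matching the dumps above.
int lanes_for(int vector_bits, int log2_of_scale) {
    int elem_bits = 8 << log2_of_scale;  // 8, 16, 32, 64
    assert(vector_bits % elem_bits == 0);
    return vector_bits / elem_bits;
}

// Always produce a scalable vector type name, even when there is one lane,
// instead of collapsing the single-lane case to a scalar.
std::string wrapper_type(int vector_bits, int log2_of_scale) {
    int elem_bits = 8 << log2_of_scale;
    int lanes = lanes_for(vector_bits, log2_of_scale);
    return "<vscale x " + std::to_string(lanes) + " x i" +
           std::to_string(elem_bits) + ">";
}
```

With these assumptions, `wrapper_type(64, 3)` gives `<vscale x 1 x i64>` rather than `i64`. Whether that is the type the intrinsic expects is the open question here.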

See proposal at #7192

miles-rusch-berkeley commented Dec 6, 2022

Hi, @zvookin, I am getting another similar error when vectorizing and using the function BoundaryConditions::repeat_edge(Input)(x, y).

 % g++ gaussian_blur.cpp -g -I includeHalide -I toolsHalide -L libHalide -lHalide -o lesson_01 -std=c++17 -lpthread -ldl
% DYLD_LIBRARY_PATH=libHalide ./lesson_01     
...          
Intrinsic has incorrect return type!
ptr @llvm.vp.icmp.nxv4i32
Intrinsic has incorrect argument type!
ptr @llvm.vp.select.nxv4i32
Internal Error at /Users/milesrusch/Documents/risc-v/rvv/llvm15/Halide/src/CodeGen_LLVM.cpp:632 triggered by user code at :
Condition failed: !verifyFunction(*function, &llvm::errs()):
zsh: abort      DYLD_LIBRARY_PATH=libHalide ./lesson_01

A file that reproduces the error with the above commands is attached:
gaussian_blur.cpp.zip

dkurt commented Dec 6, 2022

@miles-rusch-berkeley, in llvm/llvm-project#59252 it's recommended to add the +zvl256b flag.

@miles-rusch-berkeley

Hi @dkurt, thanks for your help. Where can I add this flag? Right now I am just using g++ to compile my pipeline to a static library, and I don't see how to control the llc command. Do I have to use Halide's compile_to_llvm_assembly() or specify the zvl flag in Halide::Internal::CodeGen_LLVM::target?

zvookin commented Dec 6, 2022

I should have a PR up shortly to fix both the icmp and select issues. The zvl flag would get added in CodeGen_RISCV::mattrs. It doesn't seem to fix the select issue, but I'm still debugging.
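
As a rough illustration of what adding the flag could look like, the following sketch builds an attribute string from a vector width. The helper name `rvv_mattrs` and the exact string layout are assumptions made for this example. The actual contents of `CodeGen_RISCV::mattrs` may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build an LLVM attribute string that enables the
// vector extension and a minimum vector length matching vector_bits.
std::string rvv_mattrs(int vector_bits) {
    // Zvl* extensions are named for power-of-two lengths (e.g. zvl256b),
    // so reject anything else in this sketch.
    assert(vector_bits >= 32 && (vector_bits & (vector_bits - 1)) == 0);
    return "+v,+zvl" + std::to_string(vector_bits) + "b";
}
```

Under these assumptions, `rvv_mattrs(256)` produces `+v,+zvl256b`, the flag recommended in llvm/llvm-project#59252 for a 256-bit target.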

zvookin commented Dec 6, 2022

The fix is in #7205.

zvookin commented Dec 7, 2022

Should be fixed in main now.

dkurt commented Dec 7, 2022

Thanks, @zvookin! Yes, both exceptions are now fixed on the main branch. The remaining select error is:

app: /home/dkurt/llvm-project/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:1732: virtual void llvm::RISCVDAGToDAGISel::Select(llvm::SDNode*): Assertion `RISCVTargetLowering::getRegClassIDForVecVT(SubVecContainerVT) == InRegClassID && "Unexpected subvector extraction"' failed.

I hope llvm/llvm-project#59252 (comment) may help.

dkurt closed this as completed Dec 7, 2022