[Codegen_LLVM] Directly load scalar that we'd load as vector and reinterpret #6809
Conversation
This only has basic (stride = +1/-1) correctness test coverage,
but no tests that would ensure that the load actually got widened.
If we did this at the Halide IR level, it would be obvious how to check,
but here I'm not really sure how to do that.
bool test_all(Target t) {
    bool success = true;

    success &= test_with_chunk_type<uint8_t>(t);
There is some kind of failure to simplify for the i8 chunk type somewhere in Halide:
let t5 = b0[(store.min.0 + store.s0.x.rebased)*2]
store[store.s0.x.rebased] = uint64((uint16)reinterpret(ramp(t5, b0[((store.min.0 + store.s0.x.rebased)*2) + 1] - t5, 2)))
but suddenly it's fine for larger types:
store$1[store$1.s0.x.rebased] = uint64((uint32)reinterpret(b3[ramp((store$1.min.0 + store$1.s0.x.rebased)*4, 1, 4) aligned(4, 0)]))
Force-pushed from 096e85c to be5cba7.
LoadInst *load = builder->CreateAlignedLoad(
    llvm_dst, ptr, llvm::Align(l->type.bytes()));
// FIXME: can we emit better TBAA for constant indexes here?
add_tbaa_metadata(load, l->name, /*index=*/Expr());
I think it's correct to use the original load index here, because the TBAA metadata is all in terms of the allocated type before the reinterpret.
Ah no, wait, of course it is, because we really don't change which bytes we load.
The implementation makes sense, aside from my concerns about using the reinterpret intrinsic to change vector lanes, but the test is not great: we don't use vector types in front-end code, and I have no idea what might break if we did. I guess it's OK as a temporary measure, though the weird failure with uint8s is alarming and might indicate that some assumption is being violated somewhere.
Right. Well, I'm not sure how else to write that test given what's currently available :)
It's some missed simplification. In the good case, it is simplified during
Where does this PR stand?
I guess,
Is this PR still active?
Yes, it's all connected (©)!
[Codegen_LLVM] Directly load scalar that we'd load as vector and reinterpret

Second (of three) pieces of the load widening puzzle.
Here, the codegen is taught to directly emit scalar loads,
instead of doing a vector load and `bitcast`ing it.

Refs. #6801
Refs. #6756
Refs. #6775