End to end support for bfp16 scl2vec intrinsics #278

Open. Wants to merge 2 commits into base: aie-public

Conversation

niwinanto
Collaborator

TODO:
v128bfp16ebs8 shuffle(v128bfp16ebs8 , unsigned int );
This depends on intrinsics from bfp16 upd_ext

@@ -13,3 +13,4 @@

include "AIEBaseRegisterBanks.td"
def AccRegBank : RegisterBank<"AccRegBank", [ACC512, ACC1024, ACC2048]>;
def GPRRegBank : RegisterBank<"GPRRegBank", [eR, eL, eE, EXPVEC64]>;
Collaborator

Please add a newline at the end of the file.

Collaborator

This should not be needed after a rebase.

Collaborator

@konstantinschwarz left a comment

Looks good, a few nits.

@@ -246,6 +248,7 @@ typedef int16_t v4int16 __attribute__((__vector_size__(8)));
typedef uint16_t v4uint16 __attribute__((__vector_size__(8)));
typedef uint8_t v8uint8 __attribute__((__vector_size__(8)));
typedef int8_t v8int8 __attribute__((__vector_size__(8)));
typedef char v8char __attribute__((__vector_size__(8)));
Collaborator

No need for these two types, just use v64int8 and v8int8?

Collaborator

These are defined in the global header now as V8c and V64c. Could you use just one or the other after rebasing?

Collaborator

And they are needed so that we can cast: the builtins take a v64c (a vector of char), whereas we define v64int8/v8int8 as vectors of int8_t.
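
For context, here is a minimal sketch of the cast being discussed, assuming a GCC/Clang-style compiler with the `vector_size` extension. The typedef names mirror the ones in the diff. `takes_v8char`, `wrap` and `roundtrip_preserves_bytes` are hypothetical names for illustration only, and `takes_v8char` is a stand-in for a builtin whose signature uses a vector of char, not an actual AIE builtin. Since `char` and `int8_t` are distinct element types, the two vector types are distinct as well, and an explicit cast between same-sized vectors reinterprets the bits without changing them.

```c
#include <stdint.h>
#include <string.h>

typedef char   v8char __attribute__((__vector_size__(8)));
typedef int8_t v8int8 __attribute__((__vector_size__(8)));

/* Hypothetical stand-in for a builtin that takes a vector of char. */
static v8char takes_v8char(v8char v) { return v; }

/* Wrapper exposing the int8_t-based type to users.
   Both casts are same-size vector casts, i.e. bit reinterpretations. */
static v8int8 wrap(v8int8 v) {
    return (v8int8)takes_v8char((v8char)v);
}

/* Returns 1 if all 8 bytes survive the round trip through wrap(). */
static int roundtrip_preserves_bytes(void) {
    v8int8 in = {1, -2, 3, -4, 5, -6, 7, -8};
    v8int8 out = wrap(in);
    return memcmp(&in, &out, sizeof in) == 0;
}
```

Whether the call would also compile without the explicit casts depends on the compiler's lax vector conversion setting, so spelling the casts out keeps the wrapper portable across configurations.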

@@ -572,4 +572,10 @@ def int_aie2p_sqrtf : ClangBuiltin<"__builtin_aie2p_sqrtf">, AIE2PNLF;
// DIVS
def int_aie2p_divs : AIE2PDIVS;

// BFP16 MAC MUL
class AIE2PSHUFFLEBFP16
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
Collaborator

Should use DefaultAttrsIntrinsic instead of Intrinsic

class AIE2PSHUFFLEBFP16
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
[IntrNoMem]>;
def int_aie2p_vshuffle_576_bfp16 : AIE2PSHUFFLEBFP16;
Collaborator

This selects to the same instruction whether we come from v64bfp16ebs8 (the 576-bit size) or v64bfp16ebs16 (the 544-bit size), so I would just name it aie2p_vshuffle_bfp16.
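
The two sizes quoted in this comment can be reproduced with a small calculation. This sketch assumes each element carries an 8-bit mantissa and each block of `block_size` elements shares one 8-bit exponent. That layout is inferred from the `llvm_v64i8_ty` / `llvm_v8i8_ty` operand pair in the diff and is not confirmed by this PR; `bfp16_bits` is a hypothetical helper name.

```c
/* Total storage in bits for a block-floating-point vector, assuming
   8-bit mantissas and one shared 8-bit exponent per block (assumption). */
static unsigned bfp16_bits(unsigned num_elems, unsigned block_size) {
    unsigned mantissa_bits = num_elems * 8u;                /* one byte per element */
    unsigned exponent_bits = (num_elems / block_size) * 8u; /* one byte per block */
    return mantissa_bits + exponent_bits;
}
```

Under these assumptions, `bfp16_bits(64, 8)` gives 576 and `bfp16_bits(64, 16)` gives 544, matching the two sizes named in the comment.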

const RegClassOrRegBank &RegClassOrBank = MRI.getRegClassOrRegBank(DstReg);
const TargetRegisterClass *DstRC =
RegClassOrBank.dyn_cast<const TargetRegisterClass *>();
if (!DstRC) {
Collaborator

Why is this needed?

Collaborator

Also, if this is needed, could you put it in its own commit along with the tests that it affects?

Collaborator

@khallouh left a comment

Could you add an .ll test that goes from IR to assembly?

@@ -300,3 +300,6 @@ BUILTIN(__builtin_aie2p_tanh, "V16yV16g", "nc")

//division/mod
BUILTIN(__builtin_aie2p_divstep, "vUi&Ui&Ui", "nc")

// SHUFFLE
BUILTIN(__builtin_aie2p_vshuffle_576_bfp16, "vV64cV8cV64cV8ciV64c&V8c&", "nc")
Collaborator

Same as for the LLVM intrinsic: we don't need the 576 in the name.

4 participants