End to end support for bfp16 scl2vec intrinsics #278
base: aie-public
@@ -13,3 +13,4 @@

include "AIEBaseRegisterBanks.td"
def AccRegBank : RegisterBank<"AccRegBank", [ACC512, ACC1024, ACC2048]>;
def GPRRegBank : RegisterBank<"GPRRegBank", [eR, eL, eE, EXPVEC64]>;
Please add a newline at the end of the file.
This should not be needed after the rebase.
Looks good, just a few nits.
@@ -246,6 +248,7 @@ typedef int16_t v4int16 __attribute__((__vector_size__(8)));
typedef uint16_t v4uint16 __attribute__((__vector_size__(8)));
typedef uint8_t v8uint8 __attribute__((__vector_size__(8)));
typedef int8_t v8int8 __attribute__((__vector_size__(8)));
typedef char v8char __attribute__((__vector_size__(8)));
Are these two types needed? Could you just use v64int8 and v8int8?
These are defined in the global header now as V8c and V64c. Could you use just one or the other after rebasing?
They are needed for the casts: the builtins take a v64c (a vector of char), whereas we define v64int8/v8int8 as vectors of int8_t.
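To illustrate the point about casting, here is a minimal sketch of why a char vector and an int8_t vector are distinct types that need an explicit cast. It reuses the two 8-byte typedefs quoted in the diff; `hypothetical_char_builtin` and `int8_wrapper` are made-up stand-ins for this example, not names from the PR, and the sketch assumes a GCC or Clang host compiler where int8_t is signed char.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Same shape as the typedefs quoted in the diff (GCC/Clang vector extension).
typedef int8_t v8int8 __attribute__((__vector_size__(8)));
typedef char v8char __attribute__((__vector_size__(8)));

// char and int8_t (signed char on common platforms; an assumption here) are
// distinct element types, so the two vector types are distinct as well.
constexpr bool kDistinctTypes = !std::is_same<v8char, v8int8>::value;
static_assert(sizeof(v8char) == sizeof(v8int8), "same size, so a bit-cast is allowed");

// Hypothetical stand-in for a builtin whose signature uses a vector of char.
inline v8char hypothetical_char_builtin(v8char v) { return v; }

// Hypothetical user-facing wrapper: takes and returns int8_t vectors and
// casts at the builtin boundary. The C-style cast reinterprets the bits.
inline v8int8 int8_wrapper(v8int8 v) {
  return (v8int8)hypothetical_char_builtin((v8char)v);
}

// Small helper to read one lane of the vector.
inline int8_t lane(v8int8 v, int i) {
  assert(i >= 0 && i < 8);
  return v[i];
}
```

With only one of the two type families available, every wrapper of this kind would need some other way to bridge the element-type mismatch, which is what the reply above is getting at.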
@@ -572,4 +572,10 @@ def int_aie2p_sqrtf : ClangBuiltin<"__builtin_aie2p_sqrtf">, AIE2PNLF;
// DIVS
def int_aie2p_divs : AIE2PDIVS;

// BFP16 MAC MUL
class AIE2PSHUFFLEBFP16
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
This should use DefaultAttrsIntrinsic instead of Intrinsic.
class AIE2PSHUFFLEBFP16
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
[IntrNoMem]>;
def int_aie2p_vshuffle_576_bfp16 : AIE2PSHUFFLEBFP16;
This selects to the same instruction whether we come from v64bfp16ebs8 (aka 576 size) or v64bfp16ebs16 (aka 544 size), so I would just name it aie2p_vshuffle_bfp16.
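As a side note on where the 576 and 544 figures can come from, the following sketch reproduces both numbers under an assumed layout that this thread does not spell out: 64 elements with 8-bit mantissas, plus one shared 8-bit exponent per block of 8 or 16 elements. The function name `bfp16_bits` is hypothetical.

```cpp
#include <cstddef>

// Assumed layout (not confirmed by the thread): each element stores an 8-bit
// mantissa, and each block of `block_size` elements shares one 8-bit exponent.
constexpr std::size_t bfp16_bits(std::size_t elems, std::size_t block_size) {
  return elems * 8                    // mantissa bits
         + (elems / block_size) * 8;  // shared exponent bits
}

// v64bfp16ebs8: 512 mantissa bits + 8 exponents * 8 bits = 576
static_assert(bfp16_bits(64, 8) == 576, "ebs8 variant");
// v64bfp16ebs16: 512 mantissa bits + 4 exponents * 8 bits = 544
static_assert(bfp16_bits(64, 16) == 544, "ebs16 variant");
```

Under that assumption both variants fit in the same 64-byte mantissa vector plus 8-byte exponent vector used by the intrinsic signature, which is consistent with the suggestion to drop the size from the name.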
const RegClassOrRegBank &RegClassOrBank = MRI.getRegClassOrRegBank(DstReg);
const TargetRegisterClass *DstRC =
RegClassOrBank.dyn_cast<const TargetRegisterClass *>();
if (!DstRC) {
Why is this needed?
Also, if this is needed, could you put it in its own commit along with the tests that it affects?
Could you add an .ll test that goes from IR to assembly?
@@ -300,3 +300,6 @@ BUILTIN(__builtin_aie2p_tanh, "V16yV16g", "nc")

//division/mod
BUILTIN(__builtin_aie2p_divstep, "vUi&Ui&Ui", "nc")

// SHUFFLE
BUILTIN(__builtin_aie2p_vshuffle_576_bfp16, "vV64cV8cV64cV8ciV64c&V8c&", "nc")
Same as for the LLVM intrinsic: we don't need the 576 in the name.
TODO:
v128bfp16ebs8 shuffle(v128bfp16ebs8, unsigned int);
This depends on intrinsics from bfp16 upd_ext