-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL] Refactor the way we handle duplicate attribute logic #3224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3209,65 +3209,148 @@ static void handleWorkGroupSizeHint(Sema &S, Decl *D, const ParsedAttr &AL) { | |
WGSize[1], WGSize[2])); | ||
} | ||
|
||
// Handles intel_reqd_sub_group_size. | ||
static void handleSubGroupSize(Sema &S, Decl *D, const ParsedAttr &AL) { | ||
if (S.LangOpts.SYCLIsHost) | ||
void Sema::AddIntelReqdSubGroupSize(Decl *D, const AttributeCommonInfo &CI, | ||
Expr *E) { | ||
if (LangOpts.SYCLIsHost) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was a review comment for another attribute I worked on - Should we not do any semantic analysis for attributes on host. In my PR, after discussions, I ended up handling diagnostics (even if attribute is ignored) on host. However I think this is not the 'normal' behavior for SYCL/FPGA attributes IIRC. We should be consistent with this and so I think its necessary to discuss this now as we start refactoring attributes. Thoughts? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for raising the question! My gut instinct is: if the attribute should be treated as an unknown attribute when some option isn't set, then the attribute should get a That said, I have no idea how much of these decisions are driven by the fact that dpcpp compiles things three times and so the diagnostics have the potential to come out in triplicate. (FWIW, I think that's a usability issue that should be solved -- we should probably be collating all of the diagnostics into one list and removing duplicates before displaying the diagnostics. If device vs host is important to understanding how to fix a diagnostic, we could emit whether it was a host, device, or both as part of this collated list. Or, if users don't need to do separate device and host compilations on their own, I suppose the ideal would be to generate the AST once (and emit all the diagnostics once), then use the AST three times for codegen purposes (so only codegen diagnostics would potentially be duplicated, but those are few and far between).) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I believe this was the behavior originally. The LangOpts was SyclIsDevice, IIRC. It was modified because this resulted in 'unknown attribute' warnings during host compilation phase of triple compilation. For a user unaware of the 3 passes, the diagnostic is just confusing. So we added LangOpts SYCLIsHost to silently ignore the attribute on host. Thinking about it now, my question above isn't really valid in triple compilation since device compilation will generate the warnings anyway. The only thing enabling diagnostics on host will do is generate multiple diagnostics. The issue only arises if someone does host and device compilation separately as you mentioned.
To be honest I don't really know. @erichkeane @premanandrao could you please weigh in?
This makes sense. Off the top of my head I'm not sure how much work is required to do this though. We could potentially discuss this with the architects and work on this for a future release. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, so the reason we never checked in 'host' mode was: 1- We don't want these attributes to make it into the AST when they aren't meaningful/don't do anything. This allows us to not bother checking SyclIsDevice later in the process (which is consistent with some other Clang arch bits). 2- We do not want them to diagnose in Host mode of course, since then the 3-compiles ends up warning on something that isn't the case. 3- We always compile 3 times, so by the time we get to host-mode (the 3rd one), any errors/warnings have been handled already. In general, I would be OK MOVING the SyclIsHost checks to after the diagnostics (though I don't think it is a particularly high priority item), but I don't think I would want it to be added to the AST, so we would have to make sure we only did the addAttr call in device mode. as far as collating the results, I don't have a good idea how to do that. First, one of the requirements of SYCL is that the 'host' mode compile be able to be ANY compiler, not just a SYCL compiler. That is why there is almost nothing that does host-compiling. Second, They are different cc1 invocations, so different processes. I don't know how we can teach the diagnostics engine to share between different processes. We at one point considered disabling warnings on the host-compile entirely, but we immediately ran into the case where conditional compilation resulted in missed, useful warnings. The duplicate diagnostics is considered a necessary evil at that point :/ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the explanation @erichkeane. Personally I believe it isn't necessary to emit these diagnostics on host since the attributes aren't added anyway. But I am not opposed to it either if @AaronBallman thinks its better we do. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I may regret going down this rabbit hole, but... how does this work with tooling like clang-tidy? e.g., when you run clang-tidy, do you get three compilations there as well? The reason I ask is: that could be a reason why we might want to keep these attributes in the AST -- if clang-tidy only runs one compilation mode and that mode drops the attributes, it'd be hard to write useful SYCL checkers.
Oh, ew. I thought we'd be using the integrated cc1 functionality so that we don't have to pay the price to spawn three separate executables (which is a big perf hit on Windows).
So conditional compilation is a thing that our users will do? If that's the case, then I think we want to issue the diagnostics in host mode, even if we don't want to attach the attribute to the AST. Though, that does make for a bit of awkwardness -- community has a general rule of thumb that any time an attribute is ignored there be a diagnostic issued about ignoring it (otherwise users think their attribute is doing something useful when it's not). If users do separate compilations of host vs device, then it means they could do only the host compilation with no device compilation... so the attribute would look like it does something meaningful when it doesn't. On the flip side, it would be outright baffling if the user did all three compilation modes and had an "ignored attribute" warning from one of them because they'd get confused by the attribute being applied in the other two modes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. First, we haven't really considered clang-tidy, so i actually have no idea! Presumably, it would run in the 'first' compile, which is device mode. As far as the spawning, we ended up finding that doing all 3 in one process is actually an even worse problem, to the point I think we submitted a community patch to only do that integrated CC1 when there was only 1 process to run. You end up running out of resources if you do them all in the same process (in part, because Clang doesn't clean up after itself well, even with -no-disable-free).
This is really what we took into consideration. We opted to suppress the diagnostics for that reason. That said, I think the ONLY way to split the compilation is to do -cc1 directly, so we are/were less worried about it. I could go either way on diagnosing other errors/warnings in host mode, but feel free to do so if you wish (or file a bug for someone to do). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After some offline discussion with @erichkeane and some thinking on it, I have the impression that the vast majority of users will not split their host and device compilations but instead pass Based on that, I think my answer here is to drop the attributes from the AST and not do any diagnostic checking for them when in host mode. I think the best way to do that is from tablegen in Attr.td. We have three WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
To be honest, I thought that was already the case! I think it is the right idea.
I'm not sure I see how that fits into LangOpts?
I like this end-result, and I think it makes a lot of sense. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, wait... you meant |
||
return; | ||
|
||
Expr *E = AL.getArgAsExpr(0); | ||
if (!E->isValueDependent()) { | ||
// Validate that we have an integer constant expression and then store the | ||
// converted constant expression into the semantic attribute so that we | ||
// don't have to evaluate it again later. | ||
llvm::APSInt ArgVal; | ||
ExprResult Res = VerifyIntegerConstantExpression(E, &ArgVal); | ||
if (Res.isInvalid()) | ||
return; | ||
E = Res.get(); | ||
|
||
if (D->getAttr<IntelReqdSubGroupSizeAttr>()) | ||
S.Diag(AL.getLoc(), diag::warn_duplicate_attribute) << AL; | ||
// This attribute requires a strictly positive value. | ||
if (ArgVal <= 0) { | ||
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer) | ||
<< CI << /*positive*/ 0; | ||
return; | ||
} | ||
|
||
// Check to see if there's a duplicate attribute with different values | ||
// already applied to the declaration. | ||
if (const auto *DeclAttr = D->getAttr<IntelReqdSubGroupSizeAttr>()) { | ||
// If the other attribute argument is instantiation dependent, we won't | ||
// have converted it to a constant expression yet and thus we test | ||
// whether this is a null pointer. | ||
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue()); | ||
if (DeclExpr && ArgVal != DeclExpr->getResultAsAPSInt()) { | ||
Diag(CI.getLoc(), diag::warn_duplicate_attribute) << CI; | ||
Diag(DeclAttr->getLoc(), diag::note_previous_attribute); | ||
return; | ||
} | ||
} | ||
} | ||
|
||
S.addIntelSingleArgAttr<IntelReqdSubGroupSizeAttr>(D, AL, E); | ||
D->addAttr(::new (Context) IntelReqdSubGroupSizeAttr(Context, CI, E)); | ||
} | ||
|
||
// Handles num_simd_work_items. | ||
static void handleNumSimdWorkItemsAttr(Sema &S, Decl *D, const ParsedAttr &AL) { | ||
if (D->isInvalidDecl()) | ||
return; | ||
IntelReqdSubGroupSizeAttr * | ||
Sema::MergeIntelReqdSubGroupSizeAttr(Decl *D, | ||
const IntelReqdSubGroupSizeAttr &A) { | ||
// Check to see if there's a duplicate attribute with different values | ||
// already applied to the declaration. | ||
if (const auto *DeclAttr = D->getAttr<IntelReqdSubGroupSizeAttr>()) { | ||
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue()); | ||
const auto *MergeExpr = dyn_cast<ConstantExpr>(A.getValue()); | ||
if (DeclExpr && MergeExpr && | ||
DeclExpr->getResultAsAPSInt() != MergeExpr->getResultAsAPSInt()) { | ||
Diag(DeclAttr->getLoc(), diag::warn_duplicate_attribute) << &A; | ||
Diag(A.getLoc(), diag::note_previous_attribute); | ||
return nullptr; | ||
} | ||
} | ||
return ::new (Context) IntelReqdSubGroupSizeAttr(Context, A, A.getValue()); | ||
} | ||
|
||
static void handleIntelReqdSubGroupSize(Sema &S, Decl *D, | ||
const ParsedAttr &AL) { | ||
Expr *E = AL.getArgAsExpr(0); | ||
S.AddIntelReqdSubGroupSize(D, AL, E); | ||
} | ||
|
||
if (D->getAttr<SYCLIntelNumSimdWorkItemsAttr>()) | ||
S.Diag(AL.getLoc(), diag::warn_duplicate_attribute) << AL; | ||
|
||
S.CheckDeprecatedSYCLAttributeSpelling(AL); | ||
|
||
void Sema::AddSYCLIntelNumSimdWorkItemsAttr(Decl *D, | ||
const AttributeCommonInfo &CI, | ||
Expr *E) { | ||
if (!E->isValueDependent()) { | ||
// Validate that we have an integer constant expression and then store the | ||
// converted constant expression into the semantic attribute so that we | ||
// don't have to evaluate it again later. | ||
llvm::APSInt ArgVal; | ||
ExprResult ICE = S.VerifyIntegerConstantExpression(E, &ArgVal); | ||
|
||
if (ICE.isInvalid()) | ||
ExprResult Res = VerifyIntegerConstantExpression(E, &ArgVal); | ||
if (Res.isInvalid()) | ||
return; | ||
E = Res.get(); | ||
|
||
E = ICE.get(); | ||
int64_t NumSimdWorkItems = ArgVal.getSExtValue(); | ||
|
||
if (NumSimdWorkItems == 0) { | ||
S.Diag(E->getExprLoc(), diag::err_attribute_argument_is_zero) | ||
<< AL << E->getSourceRange(); | ||
// This attribute requires a strictly positive value. | ||
if (ArgVal <= 0) { | ||
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer) | ||
<< CI << /*positive*/ 0; | ||
return; | ||
} | ||
|
||
if (const auto *A = D->getAttr<ReqdWorkGroupSizeAttr>()) { | ||
ASTContext &Ctx = S.getASTContext(); | ||
Optional<llvm::APSInt> XDimVal = A->getXDimVal(Ctx); | ||
Optional<llvm::APSInt> YDimVal = A->getYDimVal(Ctx); | ||
Optional<llvm::APSInt> ZDimVal = A->getZDimVal(Ctx); | ||
|
||
if (!(XDimVal->getZExtValue() % NumSimdWorkItems == 0 || | ||
YDimVal->getZExtValue() % NumSimdWorkItems == 0 || | ||
ZDimVal->getZExtValue() % NumSimdWorkItems == 0)) { | ||
S.Diag(AL.getLoc(), diag::err_sycl_num_kernel_wrong_reqd_wg_size) | ||
<< AL << A; | ||
S.Diag(A->getLocation(), diag::note_conflicting_attribute); | ||
// Check to see if there's a duplicate attribute with different values | ||
// already applied to the declaration. | ||
if (const auto *DeclAttr = D->getAttr<SYCLIntelNumSimdWorkItemsAttr>()) { | ||
// If the other attribute argument is instantiation dependent, we won't | ||
// have converted it to a constant expression yet and thus we test | ||
// whether this is a null pointer. | ||
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue()); | ||
if (DeclExpr && ArgVal != DeclExpr->getResultAsAPSInt()) { | ||
Diag(CI.getLoc(), diag::warn_duplicate_attribute) << CI; | ||
Diag(DeclAttr->getLoc(), diag::note_previous_attribute); | ||
return; | ||
} | ||
} | ||
|
||
// If the declaration has an [[intel::reqd_work_group_size]] attribute, | ||
// check to see if can be evenly divided by the num_simd_work_items attr. | ||
if (const auto *DeclAttr = D->getAttr<ReqdWorkGroupSizeAttr>()) { | ||
Optional<llvm::APSInt> XDimVal = DeclAttr->getXDimVal(Context); | ||
Optional<llvm::APSInt> YDimVal = DeclAttr->getYDimVal(Context); | ||
Optional<llvm::APSInt> ZDimVal = DeclAttr->getZDimVal(Context); | ||
|
||
if (!(*XDimVal % ArgVal == 0 || *YDimVal % ArgVal == 0 || | ||
*ZDimVal % ArgVal == 0)) { | ||
AaronBallman marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Diag(CI.getLoc(), diag::err_sycl_num_kernel_wrong_reqd_wg_size) | ||
<< CI << DeclAttr; | ||
Diag(DeclAttr->getLocation(), diag::note_conflicting_attribute); | ||
return; | ||
} | ||
} | ||
} | ||
|
||
S.addIntelSingleArgAttr<SYCLIntelNumSimdWorkItemsAttr>(D, AL, E); | ||
D->addAttr(::new (Context) SYCLIntelNumSimdWorkItemsAttr(Context, CI, E)); | ||
} | ||
|
||
SYCLIntelNumSimdWorkItemsAttr *Sema::MergeSYCLIntelNumSimdWorkItemsAttr( | ||
Decl *D, const SYCLIntelNumSimdWorkItemsAttr &A) { | ||
// Check to see if there's a duplicate attribute with different values | ||
// already applied to the declaration. | ||
if (const auto *DeclAttr = D->getAttr<SYCLIntelNumSimdWorkItemsAttr>()) { | ||
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue()); | ||
const auto *MergeExpr = dyn_cast<ConstantExpr>(A.getValue()); | ||
if (DeclExpr && MergeExpr && | ||
DeclExpr->getResultAsAPSInt() != MergeExpr->getResultAsAPSInt()) { | ||
Diag(DeclAttr->getLoc(), diag::warn_duplicate_attribute) << &A; | ||
Diag(A.getLoc(), diag::note_previous_attribute); | ||
return nullptr; | ||
} | ||
} | ||
return ::new (Context) | ||
SYCLIntelNumSimdWorkItemsAttr(Context, A, A.getValue()); | ||
} | ||
|
||
static void handleSYCLIntelNumSimdWorkItemsAttr(Sema &S, Decl *D, | ||
const ParsedAttr &A) { | ||
S.CheckDeprecatedSYCLAttributeSpelling(A); | ||
|
||
Expr *E = A.getArgAsExpr(0); | ||
S.AddSYCLIntelNumSimdWorkItemsAttr(D, A, E); | ||
} | ||
|
||
// Handles use_stall_enable_clusters | ||
|
@@ -8848,10 +8931,10 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, | |
handleWorkGroupSize<SYCLIntelMaxWorkGroupSizeAttr>(S, D, AL); | ||
break; | ||
case ParsedAttr::AT_IntelReqdSubGroupSize: | ||
handleSubGroupSize(S, D, AL); | ||
handleIntelReqdSubGroupSize(S, D, AL); | ||
break; | ||
case ParsedAttr::AT_SYCLIntelNumSimdWorkItems: | ||
handleNumSimdWorkItemsAttr(S, D, AL); | ||
handleSYCLIntelNumSimdWorkItemsAttr(S, D, AL); | ||
break; | ||
case ParsedAttr::AT_SYCLIntelSchedulerTargetFmaxMhz: | ||
handleSchedulerTargetFmaxMhzAttr(S, D, AL); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change wasn't strictly necessary in this patch, but I added it as a drive-by fix because I was already touching this attribute. It's an effectively NFC change (the pragma subject test had to be updated but the functionality is identical).
You don't need to add
CXXMethod
to the list becauseCXXMethodDecl
already inherits fromFunctionDecl
, so the existingFunction
subject covers both cases.