[SYCL] Refactor the way we handle duplicate attribute logic #3224

Merged
merged 3 commits into from
Feb 23, 2021
2 changes: 1 addition & 1 deletion clang/include/clang/Basic/Attr.td
@@ -1393,7 +1393,7 @@ def IntelReqdSubGroupSize: InheritableAttr {
let Spellings = [GNU<"intel_reqd_sub_group_size">,
CXX11<"intel", "reqd_sub_group_size">];
let Args = [ExprArgument<"Value">];
- let Subjects = SubjectList<[Function, CXXMethod], ErrorDiag>;
Contributor Author

This change wasn't strictly necessary in this patch, but I added it as a drive-by fix because I was already touching this attribute. It's an effectively NFC change (the pragma subject test had to be updated but the functionality is identical).

You don't need to add CXXMethod to the list because CXXMethodDecl already inherits from FunctionDecl, so the existing Function subject covers both cases.
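
To illustrate why a single `Function` subject is enough, here is a small standalone sketch. The classes are hypothetical stand-ins that copy only the inheritance relationship between `FunctionDecl` and `CXXMethodDecl`; Clang uses its own `isa<>`/`dyn_cast<>` machinery rather than `dynamic_cast`, and `matchesFunctionSubject` is an invented name.

```cpp
// Hypothetical stand-ins for Clang's declaration classes; only the
// inheritance relationship mirrors the real AST.
struct Decl {
  virtual ~Decl() = default;
};
struct FunctionDecl : Decl {};
struct CXXMethodDecl : FunctionDecl {};
struct VarDecl : Decl {};

// Models a subject list that contains only `Function`: the attribute applies
// to anything that is, or derives from, FunctionDecl.
bool matchesFunctionSubject(const Decl &D) {
  return dynamic_cast<const FunctionDecl *>(&D) != nullptr;
}
```

Under this model a member function is accepted without a separate `CXXMethod` entry, while a variable declaration is still rejected.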

+ let Subjects = SubjectList<[Function], ErrorDiag>;
let Documentation = [IntelReqdSubGroupSizeDocs];
let LangOpts = [OpenCL, SYCLIsDevice, SYCLIsHost];
}
16 changes: 12 additions & 4 deletions clang/include/clang/Sema/Sema.h
@@ -10210,6 +10210,16 @@ class Sema final {
template <typename AttrType>
void addIntelTripleArgAttr(Decl *D, const AttributeCommonInfo &CI,
Expr *XDimExpr, Expr *YDimExpr, Expr *ZDimExpr);
+ void AddIntelReqdSubGroupSize(Decl *D, const AttributeCommonInfo &CI,
+ Expr *E);
+ IntelReqdSubGroupSizeAttr *
+ MergeIntelReqdSubGroupSizeAttr(Decl *D, const IntelReqdSubGroupSizeAttr &A);
+ void AddSYCLIntelNumSimdWorkItemsAttr(Decl *D, const AttributeCommonInfo &CI,
+ Expr *E);
+ SYCLIntelNumSimdWorkItemsAttr *
+ MergeSYCLIntelNumSimdWorkItemsAttr(Decl *D,
+ const SYCLIntelNumSimdWorkItemsAttr &A);
+
/// AddAlignedAttr - Adds an aligned attribute to a particular declaration.
void AddAlignedAttr(Decl *D, const AttributeCommonInfo &CI, Expr *E,
bool IsPackExpansion);
@@ -13068,16 +13078,14 @@ void Sema::addIntelSingleArgAttr(Decl *D, const AttributeCommonInfo &CI,
return;
E = ICE.get();
int32_t ArgInt = ArgVal.getSExtValue();
- if (CI.getParsedKind() == ParsedAttr::AT_IntelReqdSubGroupSize ||
- CI.getParsedKind() == ParsedAttr::AT_IntelFPGAMaxReplicates) {
+ if (CI.getParsedKind() == ParsedAttr::AT_IntelFPGAMaxReplicates) {
if (ArgInt <= 0) {
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer)
<< CI << /*positive*/ 0;
return;
}
}
- if (CI.getParsedKind() == ParsedAttr::AT_SYCLIntelMaxGlobalWorkDim ||
- CI.getParsedKind() == ParsedAttr::AT_SYCLIntelNumSimdWorkItems) {
+ if (CI.getParsedKind() == ParsedAttr::AT_SYCLIntelMaxGlobalWorkDim) {
if (ArgInt < 0) {
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer)
<< CI << /*non-negative*/ 1;
4 changes: 4 additions & 0 deletions clang/lib/Sema/SemaDecl.cpp
@@ -2618,6 +2618,10 @@ static bool mergeDeclAttribute(Sema &S, NamedDecl *D,
NewAttr = S.mergeEnforceTCBAttr(D, *TCBA);
else if (const auto *TCBLA = dyn_cast<EnforceTCBLeafAttr>(Attr))
NewAttr = S.mergeEnforceTCBLeafAttr(D, *TCBLA);
+ else if (const auto *A = dyn_cast<IntelReqdSubGroupSizeAttr>(Attr))
+ NewAttr = S.MergeIntelReqdSubGroupSizeAttr(D, *A);
+ else if (const auto *A = dyn_cast<SYCLIntelNumSimdWorkItemsAttr>(Attr))
+ NewAttr = S.MergeSYCLIntelNumSimdWorkItemsAttr(D, *A);
else if (Attr->shouldInheritEvenIfAlreadyPresent() || !DeclHasAttr(D, Attr))
NewAttr = cast<InheritableAttr>(Attr->clone(S.Context));

163 changes: 123 additions & 40 deletions clang/lib/Sema/SemaDeclAttr.cpp
@@ -3209,65 +3209,148 @@ static void handleWorkGroupSizeHint(Sema &S, Decl *D, const ParsedAttr &AL) {
WGSize[1], WGSize[2]));
}

// Handles intel_reqd_sub_group_size.
static void handleSubGroupSize(Sema &S, Decl *D, const ParsedAttr &AL) {
if (S.LangOpts.SYCLIsHost)
void Sema::AddIntelReqdSubGroupSize(Decl *D, const AttributeCommonInfo &CI,
Expr *E) {
if (LangOpts.SYCLIsHost)
Contributor @elizabethandrews, Feb 18, 2021

This was a review comment on another attribute I worked on: should we skip semantic analysis entirely for attributes on host? In my PR, after discussion, I ended up handling diagnostics on host (even though the attribute is ignored there). However, IIRC this is not the 'normal' behavior for SYCL/FPGA attributes. We should be consistent about this, so I think it's necessary to discuss it now as we start refactoring attributes. Thoughts?

Contributor Author

Thanks for raising the question!

My gut instinct is: if the attribute should be treated as an unknown attribute when some option isn't set, then the attribute should get a LangOpts definition in Attr.td (or be a target-specific attribute) and the semantic checking in SemaDeclAttr.cpp should not get any special logic to bail out early beyond what's auto-generated for us. If the attribute is known to a given language mode, then semantic checking should happen for the parts of it that are possible to be checked, and we only elide checks that are senseless. Concretely, this means things like the deprecation, duplicate attribute, and invalid combination of attributes warnings should be diagnosed on host and device, while things that require (say) knowledge of device characteristics that aren't available when doing a host compilation are only checked when doing a device compile. My reason for this is: if you do a host-only compile and it compiles cleanly, you'd be rather surprised to suddenly hear that the attribute you're using is deprecated or conflicts with another attribute when doing a device compile. So I'm assuming that users sometimes find a need to split host vs device compilations and that's why we give them the option. (Is that a faulty assumption?)

That said, I have no idea how much of these decisions are driven by the fact that dpcpp compiles things three times and so the diagnostics have the potential to come out in triplicate. (FWIW, I think that's a usability issue that should be solved -- we should probably be collating all of the diagnostics into one list and removing duplicates before displaying the diagnostics. If device vs host is important to understanding how to fix a diagnostic, we could emit whether it was a host, device, or both as part of this collated list. Or, if users don't need to do separate device and host compilations on their own, I suppose the ideal would be to generate the AST once (and emit all the diagnostics once), then use the AST three times for codegen purposes (so only codegen diagnostics would potentially be duplicated, but those are few and far between).)
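
A rough sketch of the collation idea, assuming the diagnostics from every pass could be gathered in one place (nothing in this thread establishes that they can). All types and names below are hypothetical and are not part of Clang's diagnostics engine.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical compilation modes and diagnostic records; none of these types
// exist in Clang.
enum class Mode { Host, Device };

struct Diagnostic {
  std::string File;
  unsigned Line;
  std::string Message;
  Mode From;
};

struct CollatedDiagnostic {
  std::string File;
  unsigned Line = 0;
  std::string Message;
  bool OnHost = false;
  bool OnDevice = false;
};

// Merge diagnostics from all passes: identical (file, line, message) triples
// collapse into one entry that remembers which modes reported it.
std::vector<CollatedDiagnostic> collate(const std::vector<Diagnostic> &All) {
  std::map<std::tuple<std::string, unsigned, std::string>, CollatedDiagnostic>
      Merged;
  for (const Diagnostic &D : All) {
    CollatedDiagnostic &C = Merged[std::make_tuple(D.File, D.Line, D.Message)];
    C.File = D.File;
    C.Line = D.Line;
    C.Message = D.Message;
    if (D.From == Mode::Host)
      C.OnHost = true;
    else
      C.OnDevice = true;
  }
  std::vector<CollatedDiagnostic> Out;
  for (const auto &Entry : Merged)
    Out.push_back(Entry.second);
  return Out;
}
```

A warning reported by both the host and device passes would then appear once, flagged as coming from both modes.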

Contributor

> Thanks for raising the question!
>
> My gut instinct is: if the attribute should be treated as an unknown attribute when some option isn't set, then the attribute should get a LangOpts definition in Attr.td (or be a target-specific attribute) and the semantic checking in SemaDeclAttr.cpp should not get any special logic to bail out early beyond what's auto-generated for us.

I believe this was the original behavior. The LangOpts was SYCLIsDevice, IIRC. It was changed because it produced 'unknown attribute' warnings during the host phase of the triple compilation. For a user unaware of the three passes, that diagnostic is just confusing. So we added the SYCLIsHost LangOpts to silently ignore the attribute on host.

Thinking about it now, my question above isn't really valid in triple compilation since device compilation will generate the warnings anyway. The only thing enabling diagnostics on host will do is generate multiple diagnostics. The issue only arises if someone does host and device compilation separately as you mentioned.

> If the attribute is known to a given language mode, then semantic checking should happen for the parts of it that are possible to be checked, and we only elide checks that are senseless. Concretely, this means things like the deprecation, duplicate attribute, and invalid combination of attributes warnings should be diagnosed on host and device, while things that require (say) knowledge of device characteristics that aren't available when doing a host compilation are only checked when doing a device compile. My reason for this is: if you do a host-only compile and it compiles cleanly, you'd be rather surprised to suddenly hear that the attribute you're using is deprecated or conflicts with another attribute when doing a device compile. So I'm assuming that users sometimes find a need to split host vs device compilations and that's why we give them the option. (Is that a faulty assumption?)

To be honest I don't really know. @erichkeane @premanandrao could you please weigh in?

> That said, I have no idea how much of these decisions are driven by the fact that dpcpp compiles things three times and so the diagnostics have the potential to come out in triplicate. (FWIW, I think that's a usability issue that should be solved -- we should probably be collating all of the diagnostics into one list and removing duplicates before displaying the diagnostics. If device vs host is important to understanding how to fix a diagnostic, we could emit whether it was a host, device, or both as part of this collated list. Or, if users don't need to do separate device and host compilations on their own, I suppose the ideal would be to generate the AST once (and emit all the diagnostics once), then use the AST three times for codegen purposes (so only codegen diagnostics would potentially be duplicated, but those are few and far between).)

This makes sense. Off the top of my head I'm not sure how much work is required to do this though. We could potentially discuss this with the architects and work on this for a future release.

Contributor

Yes, so the reason we never checked in 'host' mode was:

1- We don't want these attributes to make it into the AST when they aren't meaningful/don't do anything. This allows us to not bother checking SYCLIsDevice later in the process (which is consistent with some other Clang arch bits).

2- We do not want them to diagnose in host mode, of course, since then the 3-compiles ends up warning on something that isn't the case.

3- We always compile 3 times, so by the time we get to host-mode (the 3rd one), any errors/warnings have been handled already.

In general, I would be OK MOVING the SYCLIsHost checks to after the diagnostics (though I don't think it is a particularly high priority item), but I don't think I would want the attribute to be added to the AST, so we would have to make sure we only did the addAttr call in device mode.

As far as collating the results, I don't have a good idea how to do that. First, one of the requirements of SYCL is that the 'host' mode compile be able to be ANY compiler, not just a SYCL compiler. That is why there is almost nothing that does host-compiling. Second, they are different cc1 invocations, so different processes. I don't know how we can teach the diagnostics engine to share between different processes.

We at one point considered disabling warnings on the host-compile entirely, but we immediately ran into the case where conditional compilation resulted in missed, useful warnings. The duplicate diagnostics are considered a necessary evil at that point :/
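
The alternative mentioned in this comment (diagnose first, attach only in device mode) could look roughly like the sketch below. It models that suggestion only, not what the patch does: the patch returns early on host before any checking. `Outcome` and `processSubGroupSize` are hypothetical names, and `IsHost` stands in for the host-mode language option.

```cpp
// Hypothetical outcome record; not a Clang type.
struct Outcome {
  bool Diagnosed; // an error or warning was issued
  bool Attached;  // the attribute was added to the declaration
};

// Validate first, then decide whether to attach. The positive-value rule is
// the one this patch applies to intel_reqd_sub_group_size.
Outcome processSubGroupSize(int Value, bool IsHost) {
  if (Value <= 0)
    return {true, false}; // diagnosed in every mode
  if (IsHost)
    return {false, false}; // checked, but kept out of the host AST
  return {false, true};
}
```

With this ordering an invalid argument is reported in host mode too, while a valid attribute is still attached only during device compilation.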

Contributor

Thanks for the explanation @erichkeane.

Personally, I believe it isn't necessary to emit these diagnostics on host since the attributes aren't added anyway. But I'm not opposed to it either if @AaronBallman thinks it's better that we do.

Contributor Author

> 1- We don't want these attributes to make it into the AST when they aren't meaningful/don't do anything. This allows us to not bother checking SYCLIsDevice later in the process (which is consistent with some other Clang arch bits).

I may regret going down this rabbit hole, but... how does this work with tooling like clang-tidy? e.g., when you run clang-tidy, do you get three compilations there as well? The reason I ask is: that could be a reason why we might want to keep these attributes in the AST -- if clang-tidy only runs one compilation mode and that mode drops the attributes, it'd be hard to write useful SYCL checkers.

> As far as collating the results, I don't have a good idea how to do that. First, one of the requirements of SYCL is that the 'host' mode compile be able to be ANY compiler, not just a SYCL compiler. That is why there is almost nothing that does host-compiling. Second, they are different cc1 invocations, so different processes. I don't know how we can teach the diagnostics engine to share between different processes.

Oh, ew. I thought we'd be using the integrated cc1 functionality so that we don't have to pay the price to spawn three separate executables (which is a big perf hit on Windows).

> We at one point considered disabling warnings on the host-compile entirely, but we immediately ran into the case where conditional compilation resulted in missed, useful warnings. The duplicate diagnostics are considered a necessary evil at that point :/

So conditional compilation is a thing that our users will do? If that's the case, then I think we want to issue the diagnostics in host mode, even if we don't want to attach the attribute to the AST. Though, that does make for a bit of awkwardness -- community has a general rule of thumb that any time an attribute is ignored there be a diagnostic issued about ignoring it (otherwise users think their attribute is doing something useful when it's not). If users do separate compilations of host vs device, then it means they could do only the host compilation with no device compilation... so the attribute would look like it does something meaningful when it doesn't. On the flip side, it would be outright baffling if the user did all three compilation modes and had an "ignored attribute" warning from one of them because they'd get confused by the attribute being applied in the other two modes.

Contributor

First, we haven't really considered clang-tidy, so I actually have no idea! Presumably, it would run in the 'first' compile, which is device mode.

As far as the spawning, we ended up finding that doing all 3 in one process is actually an even worse problem, to the point I think we submitted a community patch to only do that integrated CC1 when there was only 1 process to run. You end up running out of resources if you do them all in the same process (in part, because Clang doesn't clean up after itself well, even with -no-disable-free).

> On the flip side, it would be outright baffling if the user did all three compilation modes and had an "ignored attribute" warning from one of them because they'd get confused by the attribute being applied in the other two modes.

This is really what we took into consideration. We opted to suppress the diagnostics for that reason.

That said, I think the ONLY way to split the compilation is to do -cc1 directly, so we are/were less worried about it. I could go either way on diagnosing other errors/warnings in host mode, but feel free to do so if you wish (or file a bug for someone to do).

Contributor Author

After some offline discussion with @erichkeane and some thinking on it, I have the impression that the vast majority of users will not split their host and device compilations but instead pass -fsycl to the compiler and get all three compilations at once. So we should design for that use case.

Based on that, I think my answer here is to drop the attributes from the AST and not do any diagnostic checking for them when in host mode.

I think the best way to do that is from tablegen in Attr.td. We have three LangOpt subclasses, SYCL, SYCLIsDevice, and SYCLIsHost, where SYCL and SYCLIsDevice are confusingly checking the same language option. I think we should make SYCL mean "device or host" (getLangOpts().SYCL), and SYCLIs<mode> mean "SYCL is enabled and compiling for <mode>" (getLangOpts().SYCL && getLangOpts().SYCLIs<mode>()). Then I think we should modify the LangOpt class to accept a bit saying "don't diagnose the attribute as being unknown if this bit is set" and set that bit for the SYCLIsDevice and SYCLIsHost subclasses. The end result is: any attribute that's available in SYCL is available in both host and device mode or diagnosed as an unknown attribute if not compiling for either host or device mode; any attribute that's available for device or host mode exclusively (rather than both) will be diagnosed in the given mode, but otherwise silently ignored unless not compiling for SYCL at all. I think this work should be done in a follow-up patch.

WDYT?
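
One way to read the proposed end result is as a small decision function. This is a sketch of the proposal as worded above, not existing Clang or Attr.td behavior; `LangOptions`, `Requires`, `Disposition`, and `classify` are hypothetical names invented for illustration.

```cpp
// Hypothetical model of the relevant language options.
struct LangOptions {
  bool SYCL = false;         // compiling SYCL at all (host or device)
  bool SYCLIsDevice = false; // SYCL device compilation
  bool SYCLIsHost = false;   // SYCL host compilation
};

// Which LangOpt an attribute lists in its definition.
enum class Requires { SYCL, SYCLIsDevice, SYCLIsHost };

// What happens to the attribute in a given compilation.
enum class Disposition { Process, SilentlyIgnore, DiagnoseUnknown };

Disposition classify(Requires R, const LangOptions &LO) {
  // Outside SYCL entirely, every such attribute is unknown.
  if (!LO.SYCL)
    return Disposition::DiagnoseUnknown;
  switch (R) {
  case Requires::SYCL:
    return Disposition::Process; // available on both host and device
  case Requires::SYCLIsDevice:
    return LO.SYCLIsDevice ? Disposition::Process
                           : Disposition::SilentlyIgnore;
  case Requires::SYCLIsHost:
    return LO.SYCLIsHost ? Disposition::Process
                         : Disposition::SilentlyIgnore;
  }
  return Disposition::DiagnoseUnknown;
}
```

Under this reading a device-only attribute is checked during device compilation, dropped without a warning during host compilation, and reported as unknown only when SYCL is off.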

Contributor

> I think we should make SYCL mean "device or host" (getLangOpts().SYCL), and SYCLIs<mode> mean "SYCL is enabled and compiling for <mode>" (getLangOpts().SYCL && getLangOpts().SYCLIs<mode>()).

To be honest, I thought that was already the case! I think it is the right idea.

> Then I think we should modify the LangOpt class to accept a bit saying "don't diagnose the attribute as being unknown if this bit is set" and set that bit for the SYCLIsDevice and SYCLIsHost subclasses.

I'm not sure I see how that fits into LangOpts?

> The end result is: any attribute that's available in SYCL is available in both host and device mode or diagnosed as an unknown attribute if not compiling for either host or device mode; any attribute that's available for device or host mode exclusively (rather than both) will be diagnosed in the given mode, but otherwise silently ignored unless not compiling for SYCL at all. I think this work should be done in a follow-up patch.

I like this end-result, and I think it makes a lot of sense.

Contributor

Oh, wait... you meant LangOpt the class in Attr.td, didn't you? Yeah, then I think I agree with everything above.

return;

Expr *E = AL.getArgAsExpr(0);
if (!E->isValueDependent()) {
// Validate that we have an integer constant expression and then store the
// converted constant expression into the semantic attribute so that we
// don't have to evaluate it again later.
llvm::APSInt ArgVal;
ExprResult Res = VerifyIntegerConstantExpression(E, &ArgVal);
if (Res.isInvalid())
return;
E = Res.get();

if (D->getAttr<IntelReqdSubGroupSizeAttr>())
S.Diag(AL.getLoc(), diag::warn_duplicate_attribute) << AL;
// This attribute requires a strictly positive value.
if (ArgVal <= 0) {
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer)
<< CI << /*positive*/ 0;
return;
}

// Check to see if there's a duplicate attribute with different values
// already applied to the declaration.
if (const auto *DeclAttr = D->getAttr<IntelReqdSubGroupSizeAttr>()) {
// If the other attribute argument is instantiation dependent, we won't
// have converted it to a constant expression yet and thus we test
// whether this is a null pointer.
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue());
if (DeclExpr && ArgVal != DeclExpr->getResultAsAPSInt()) {
Diag(CI.getLoc(), diag::warn_duplicate_attribute) << CI;
Diag(DeclAttr->getLoc(), diag::note_previous_attribute);
return;
}
}
}

S.addIntelSingleArgAttr<IntelReqdSubGroupSizeAttr>(D, AL, E);
D->addAttr(::new (Context) IntelReqdSubGroupSizeAttr(Context, CI, E));
}

// Handles num_simd_work_items.
static void handleNumSimdWorkItemsAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
if (D->isInvalidDecl())
return;
IntelReqdSubGroupSizeAttr *
Sema::MergeIntelReqdSubGroupSizeAttr(Decl *D,
const IntelReqdSubGroupSizeAttr &A) {
// Check to see if there's a duplicate attribute with different values
// already applied to the declaration.
if (const auto *DeclAttr = D->getAttr<IntelReqdSubGroupSizeAttr>()) {
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue());
const auto *MergeExpr = dyn_cast<ConstantExpr>(A.getValue());
if (DeclExpr && MergeExpr &&
DeclExpr->getResultAsAPSInt() != MergeExpr->getResultAsAPSInt()) {
Diag(DeclAttr->getLoc(), diag::warn_duplicate_attribute) << &A;
Diag(A.getLoc(), diag::note_previous_attribute);
return nullptr;
}
}
return ::new (Context) IntelReqdSubGroupSizeAttr(Context, A, A.getValue());
}

static void handleIntelReqdSubGroupSize(Sema &S, Decl *D,
const ParsedAttr &AL) {
Expr *E = AL.getArgAsExpr(0);
S.AddIntelReqdSubGroupSize(D, AL, E);
}

if (D->getAttr<SYCLIntelNumSimdWorkItemsAttr>())
S.Diag(AL.getLoc(), diag::warn_duplicate_attribute) << AL;

S.CheckDeprecatedSYCLAttributeSpelling(AL);

void Sema::AddSYCLIntelNumSimdWorkItemsAttr(Decl *D,
const AttributeCommonInfo &CI,
Expr *E) {
if (!E->isValueDependent()) {
// Validate that we have an integer constant expression and then store the
// converted constant expression into the semantic attribute so that we
// don't have to evaluate it again later.
llvm::APSInt ArgVal;
ExprResult ICE = S.VerifyIntegerConstantExpression(E, &ArgVal);

if (ICE.isInvalid())
ExprResult Res = VerifyIntegerConstantExpression(E, &ArgVal);
if (Res.isInvalid())
return;
E = Res.get();

E = ICE.get();
int64_t NumSimdWorkItems = ArgVal.getSExtValue();

if (NumSimdWorkItems == 0) {
S.Diag(E->getExprLoc(), diag::err_attribute_argument_is_zero)
<< AL << E->getSourceRange();
// This attribute requires a strictly positive value.
if (ArgVal <= 0) {
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer)
<< CI << /*positive*/ 0;
return;
}

if (const auto *A = D->getAttr<ReqdWorkGroupSizeAttr>()) {
ASTContext &Ctx = S.getASTContext();
Optional<llvm::APSInt> XDimVal = A->getXDimVal(Ctx);
Optional<llvm::APSInt> YDimVal = A->getYDimVal(Ctx);
Optional<llvm::APSInt> ZDimVal = A->getZDimVal(Ctx);

if (!(XDimVal->getZExtValue() % NumSimdWorkItems == 0 ||
YDimVal->getZExtValue() % NumSimdWorkItems == 0 ||
ZDimVal->getZExtValue() % NumSimdWorkItems == 0)) {
S.Diag(AL.getLoc(), diag::err_sycl_num_kernel_wrong_reqd_wg_size)
<< AL << A;
S.Diag(A->getLocation(), diag::note_conflicting_attribute);
// Check to see if there's a duplicate attribute with different values
// already applied to the declaration.
if (const auto *DeclAttr = D->getAttr<SYCLIntelNumSimdWorkItemsAttr>()) {
// If the other attribute argument is instantiation dependent, we won't
// have converted it to a constant expression yet and thus we test
// whether this is a null pointer.
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue());
if (DeclExpr && ArgVal != DeclExpr->getResultAsAPSInt()) {
Diag(CI.getLoc(), diag::warn_duplicate_attribute) << CI;
Diag(DeclAttr->getLoc(), diag::note_previous_attribute);
return;
}
}

// If the declaration has an [[intel::reqd_work_group_size]] attribute,
// check to see if can be evenly divided by the num_simd_work_items attr.
if (const auto *DeclAttr = D->getAttr<ReqdWorkGroupSizeAttr>()) {
Optional<llvm::APSInt> XDimVal = DeclAttr->getXDimVal(Context);
Optional<llvm::APSInt> YDimVal = DeclAttr->getYDimVal(Context);
Optional<llvm::APSInt> ZDimVal = DeclAttr->getZDimVal(Context);

if (!(*XDimVal % ArgVal == 0 || *YDimVal % ArgVal == 0 ||
*ZDimVal % ArgVal == 0)) {
Diag(CI.getLoc(), diag::err_sycl_num_kernel_wrong_reqd_wg_size)
<< CI << DeclAttr;
Diag(DeclAttr->getLocation(), diag::note_conflicting_attribute);
return;
}
}
}

S.addIntelSingleArgAttr<SYCLIntelNumSimdWorkItemsAttr>(D, AL, E);
D->addAttr(::new (Context) SYCLIntelNumSimdWorkItemsAttr(Context, CI, E));
}

SYCLIntelNumSimdWorkItemsAttr *Sema::MergeSYCLIntelNumSimdWorkItemsAttr(
Decl *D, const SYCLIntelNumSimdWorkItemsAttr &A) {
// Check to see if there's a duplicate attribute with different values
// already applied to the declaration.
if (const auto *DeclAttr = D->getAttr<SYCLIntelNumSimdWorkItemsAttr>()) {
const auto *DeclExpr = dyn_cast<ConstantExpr>(DeclAttr->getValue());
const auto *MergeExpr = dyn_cast<ConstantExpr>(A.getValue());
if (DeclExpr && MergeExpr &&
DeclExpr->getResultAsAPSInt() != MergeExpr->getResultAsAPSInt()) {
Diag(DeclAttr->getLoc(), diag::warn_duplicate_attribute) << &A;
Diag(A.getLoc(), diag::note_previous_attribute);
return nullptr;
}
}
return ::new (Context)
SYCLIntelNumSimdWorkItemsAttr(Context, A, A.getValue());
}

static void handleSYCLIntelNumSimdWorkItemsAttr(Sema &S, Decl *D,
const ParsedAttr &A) {
S.CheckDeprecatedSYCLAttributeSpelling(A);

Expr *E = A.getArgAsExpr(0);
S.AddSYCLIntelNumSimdWorkItemsAttr(D, A, E);
}

// Handles use_stall_enable_clusters
@@ -8848,10 +8931,10 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
handleWorkGroupSize<SYCLIntelMaxWorkGroupSizeAttr>(S, D, AL);
break;
case ParsedAttr::AT_IntelReqdSubGroupSize:
- handleSubGroupSize(S, D, AL);
+ handleIntelReqdSubGroupSize(S, D, AL);
break;
case ParsedAttr::AT_SYCLIntelNumSimdWorkItems:
- handleNumSimdWorkItemsAttr(S, D, AL);
+ handleSYCLIntelNumSimdWorkItemsAttr(S, D, AL);
break;
case ParsedAttr::AT_SYCLIntelSchedulerTargetFmaxMhz:
handleSchedulerTargetFmaxMhzAttr(S, D, AL);
28 changes: 24 additions & 4 deletions clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
@@ -643,6 +643,26 @@ static void instantiateSYCLIntelLoopFuseAttr(
S.addSYCLIntelLoopFuseAttr(New, *Attr, Result.getAs<Expr>());
}

+ static void instantiateIntelReqdSubGroupSize(
+ Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs,
+ const IntelReqdSubGroupSizeAttr *A, Decl *New) {
+ EnterExpressionEvaluationContext Unevaluated(
+ S, Sema::ExpressionEvaluationContext::ConstantEvaluated);
+ ExprResult Result = S.SubstExpr(A->getValue(), TemplateArgs);
+ if (!Result.isInvalid())
+ S.AddIntelReqdSubGroupSize(New, *A, Result.getAs<Expr>());
+ }
+
+ static void instantiateSYCLIntelNumSimdWorkItemsAttr(
+ Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs,
+ const SYCLIntelNumSimdWorkItemsAttr *A, Decl *New) {
+ EnterExpressionEvaluationContext Unevaluated(
+ S, Sema::ExpressionEvaluationContext::ConstantEvaluated);
+ ExprResult Result = S.SubstExpr(A->getValue(), TemplateArgs);
+ if (!Result.isInvalid())
+ S.AddSYCLIntelNumSimdWorkItemsAttr(New, *A, Result.getAs<Expr>());
+ }
+
template <typename AttrName>
static void instantiateIntelSYCLFunctionAttr(
Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs,
@@ -834,14 +854,14 @@ void Sema::InstantiateAttrs(const MultiLevelTemplateArgumentList &TemplateArgs,
}
if (const auto *IntelReqdSubGroupSize =
dyn_cast<IntelReqdSubGroupSizeAttr>(TmplAttr)) {
- instantiateIntelSYCLFunctionAttr<IntelReqdSubGroupSizeAttr>(
- *this, TemplateArgs, IntelReqdSubGroupSize, New);
+ instantiateIntelReqdSubGroupSize(*this, TemplateArgs,
+ IntelReqdSubGroupSize, New);
continue;
}
if (const auto *SYCLIntelNumSimdWorkItems =
dyn_cast<SYCLIntelNumSimdWorkItemsAttr>(TmplAttr)) {
- instantiateIntelSYCLFunctionAttr<SYCLIntelNumSimdWorkItemsAttr>(
- *this, TemplateArgs, SYCLIntelNumSimdWorkItems, New);
+ instantiateSYCLIntelNumSimdWorkItemsAttr(*this, TemplateArgs,
+ SYCLIntelNumSimdWorkItems, New);
continue;
}
if (const auto *SYCLIntelSchedulerTargetFmaxMhz =
@@ -71,7 +71,7 @@
// CHECK-NEXT: IBAction (SubjectMatchRule_objc_method_is_instance)
// CHECK-NEXT: IFunc (SubjectMatchRule_function)
// CHECK-NEXT: InitPriority (SubjectMatchRule_variable)
- // CHECK-NEXT: IntelReqdSubGroupSize (SubjectMatchRule_function, SubjectMatchRule_function_is_member)
+ // CHECK-NEXT: IntelReqdSubGroupSize (SubjectMatchRule_function)
// CHECK-NEXT: InternalLinkage (SubjectMatchRule_variable, SubjectMatchRule_function, SubjectMatchRule_record)
// CHECK-NEXT: LTOVisibilityPublic (SubjectMatchRule_record)
// CHECK-NEXT: Leaf (SubjectMatchRule_function)
3 changes: 2 additions & 1 deletion clang/test/SemaOpenCL/invalid-kernel-attrs.cl
@@ -39,7 +39,8 @@ kernel __attribute__((intel_reqd_sub_group_size(0))) void kernel15() {} // expec

kernel __attribute__((intel_reqd_sub_group_size(-1))) void kernel16() {} // expected-error {{'intel_reqd_sub_group_size' attribute requires a positive integral compile time constant expression}}

- kernel __attribute__((intel_reqd_sub_group_size(8))) __attribute__((intel_reqd_sub_group_size(16))) void kernel17() {} //expected-warning{{attribute 'intel_reqd_sub_group_size' is already applied with different parameters}}
+ kernel __attribute__((intel_reqd_sub_group_size(8))) __attribute__((intel_reqd_sub_group_size(16))) void kernel17() {} //expected-warning{{attribute 'intel_reqd_sub_group_size' is already applied with different parameters}} \
+ // expected-note {{previous attribute is here}}

__kernel __attribute__((work_group_size_hint(8,-16,32))) void neg1() {} //expected-error{{'work_group_size_hint' attribute requires a non-negative integral compile time constant expression}}
__kernel __attribute__((reqd_work_group_size(8, 16, -32))) void neg2() {} //expected-warning{{implicit conversion changes signedness: 'int' to 'unsigned long long'}}
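
As an illustration of the duplicate-attribute rule that the kernel17 test exercises, here is a minimal standalone sketch. It assumes the comparison works the way the Add*/Merge* functions in this patch describe it: warn only when both arguments are already evaluated constants and their values differ. `AttrArg`, `DuplicateCheck`, and `checkDuplicate` are hypothetical names, not Clang APIs.

```cpp
// Hypothetical model of an attribute argument. Known == false stands in for a
// value-dependent expression that has not been evaluated yet.
struct AttrArg {
  bool Known;
  int Value;
};

enum class DuplicateCheck { Accept, WarnDifferentValue };

// Only warn when both arguments are known constants with different values;
// a dependent argument defers the decision to template instantiation.
DuplicateCheck checkDuplicate(const AttrArg &Existing,
                              const AttrArg &Incoming) {
  if (Existing.Known && Incoming.Known && Existing.Value != Incoming.Value)
    return DuplicateCheck::WarnDifferentValue;
  return DuplicateCheck::Accept;
}
```

In this model the pair (8, 16) from kernel17 triggers the warning, a repeated identical value is accepted silently, and a dependent argument is accepted for now.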