-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Report Compiling error for recursion in SYCL kernel #3390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: gejin <ge.jin@intel.com>
/summary:run |
+@AaronBallman, who reviewed somewhat related patch to clang-tidy tool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as a narrow fix goes, this seems reasonable. However, there is some confusing code in VisitCallExpr()
that should likely be cleaned up (like the dyn_cast<FunctionDecl>(Callee)
check when Callee
is already defined as a FunctionDecl
, or an uncommented exception about pseduo destructor calls). Also, this code seems very fragile (if you don't know to call CollectKernelSet
before visiting, then the visitor won't catch any recursive calls at all) and I'm not 100% convinced it isn't overly aggressive at finding recursion. However, none of this is the result of the changes in this PR, so LGTM!
@@ -24,7 +24,7 @@ void *operator new(size_t); | |||
void usage2(myFuncDef functionPtr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be possible to add a test like
void kernel3(void) {
if constexpr (false)
kernel3();
if (false)
kernel3();
}
and verify it does not fire?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @keryell
I am afraid current implementation can't handle this as we are analyzing the call graph to recognize all recursions and don't know whether the recursive call is in a "dead" branch.
Thanks very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable to not diagnose recursion in dead branches, but we should still have test coverage (with comments) that show we've explicitly considered this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is still unaddressed, btw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AaronBallman :Note that my patch adds a test for this. See: #3475
Hi, @bader |
According to my understanding some SYCL targets are able to support recursive calls, so it's quite possible that we would want to enable it for regular SYCL mode as well. |
Signed-off-by: gejin <ge.jin@intel.com>
Signed-off-by: gejin <ge.jin@intel.com>
971f2e2
Previously, for all SYCL kernel(or SYCL_EXTERNAL) functions and functions called by them directly or indirectly, compiler will go through them one by one for sema check(recursion check is done in this period). The rules used in the check are same for all those functions. This mechanism seems can't meet our requirements now since different rules may be required for different types of SYCL kernels and all its callees(directly or indirectly). For example, recursion is not allowed for regular SYCL kernel according to spec but recursive function with "NoInline" attr is permitted for SYCL_ESIMD_KERNEL(https://github.com/intel/llvm-test-suite/blob/intel/SYCL/ESIMD/noinline_call_recursive.cpp#L13). So, we need to follow correct rules when checking different type of SYCL kernel and its callees. In order to achieve this, we traverse over the call graph to construct a map whose key is each SYCL kernel or SYCL_EXTERNAL function and value is the list of its direct or indirect callees. |
/summary:run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be possible to use a const FunctionDecl *
just about everywhere in this patch so that we get const-correctness. Can you investigate that?
clang/lib/Sema/SemaSYCL.cpp
Outdated
CollectKernelSet(Callee, FD, VisitedSet); | ||
VisitedSet.erase(Callee); | ||
for (const CallGraphNode *CGN : *SYCLFuncCGN) { | ||
if (FunctionDecl *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (FunctionDecl *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { | |
if (auto *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { |
clang/lib/Sema/SemaSYCL.cpp
Outdated
if (IsCyclicSYCLFunction(Callee)) { | ||
RecursiveSet.insert(Callee); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (IsCyclicSYCLFunction(Callee)) { | |
RecursiveSet.insert(Callee); | |
} | |
if (IsCyclicSYCLFunction(Callee)) | |
RecursiveSet.insert(Callee); |
clang/lib/Sema/SemaSYCL.cpp
Outdated
return false; | ||
llvm::SmallPtrSet<FunctionDecl *, 10> VisitedSet; | ||
for (const CallGraphNode *CGN : *SYCLFuncCGN) { | ||
if (FunctionDecl *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (FunctionDecl *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { | |
if (auto *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { |
clang/lib/Sema/SemaSYCL.cpp
Outdated
if (!SYCLFuncCGN) | ||
return false; | ||
for (const CallGraphNode *CGN : *SYCLFuncCGN) { | ||
if (FunctionDecl *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (FunctionDecl *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { | |
if (auto *Callee = dyn_cast<FunctionDecl>(CGN->getDecl())) { |
clang/lib/Sema/SemaSYCL.cpp
Outdated
llvm::SmallPtrSet<FunctionDecl *, 10> NonESIMDModeCheckSet; | ||
llvm::SmallPtrSet<FunctionDecl *, 10> ESIMDModeCheckSet; | ||
for (Decl *D : syclDeviceDecls()) { | ||
if (auto SYCLKernelDec = dyn_cast<FunctionDecl>(D)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (auto SYCLKernelDec = dyn_cast<FunctionDecl>(D)) { | |
if (auto *SYCLKernelDec = dyn_cast<FunctionDecl>(D)) { |
clang/lib/Sema/SemaSYCL.cpp
Outdated
Marker.SetTraverseEsimdKernel(SYCLKernelDef->hasAttr<SYCLSimdAttr>()); | ||
auto &SYCLFunctionCheckSet = | ||
(SYCLKernelDef->hasAttr<SYCLSimdAttr>() ? ESIMDModeCheckSet | ||
: NonESIMDModeCheckSet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marker.SetTraverseEsimdKernel(SYCLKernelDef->hasAttr<SYCLSimdAttr>()); | |
auto &SYCLFunctionCheckSet = | |
(SYCLKernelDef->hasAttr<SYCLSimdAttr>() ? ESIMDModeCheckSet | |
: NonESIMDModeCheckSet); | |
bool IsSIMD = SYCLKernelDef->hasAttr<SYCLSimdAttr>(); | |
Marker.SetTraverseEsimdKernel(IsSIMD); | |
auto &SYCLFunctionCheckSet = IsSIMD ? ESIMDModeCheckSet : NonESIMDModeCheckSet; |
@@ -24,7 +24,7 @@ void *operator new(size_t); | |||
void usage2(myFuncDef functionPtr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable to not diagnose recursion in dead branches, but we should still have test coverage (with comments) that show we've explicitly considered this case.
Adding @erichkeane as a reviewer for visibility -- he was looking at this same chunk of code recently and may have opinions on whether these changes should be generalized for other uses or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While some of this is an improvement, it seems like a lot of it is breaking the cohesion of the marker type, it seems to be gaining dependence on even more external state. I don't believe this to be the right solution to this problem.
clang/lib/Sema/SemaSYCL.cpp
Outdated
// kernel set, this is used for error diagnostics. | ||
// Record the mapping between each SYCL kernel or SYCL_EXTERNAL function and | ||
// functions called by it. | ||
llvm::DenseMap<FunctionDecl *, llvm::SmallPtrSet<FunctionDecl *, 10>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be re-implementing a CallGraphNode. Is there a reason we need to re-store these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @erichkeane
I didn't quite understand the question here, did you ask why the patch dropped the "KernelSet" and introduced the "SYCLKernelInvokeMap" here?
Thanks very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm basically asking 'why' on a lot of things throughout this patch. I'm unconvinced that KernelSet was necessary, and I'm even further unconvinced that SYCLKernelInvokeMap is required. I think this patch is well-intentioned but is the straw that breaks the camels back here.
The complexity of this type and section of code is too much after this patch. We've hit 'unmaintainable' unfortunately. So if you can do as @AaronBallman suggested and put together a bunch of lit tests that validate the edge-cases that you found, he and I will work on refactoring this code ensuring it fixes the issues you saw.
clang/lib/Sema/SemaSYCL.cpp
Outdated
llvm::SmallPtrSet<FunctionDecl *, 10> SYCLKernelInvokeSet; | ||
Marker.WalkSYCLFunctionCG(SYCLKernel, SYCLKernelInvokeSet, | ||
SYCLFunctionCyclicCheckSet); | ||
Marker.SYCLKernelInvokeMap.insert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me the ownership model is all wrong here. The Marker type I think should just own these operations and collections. We shouldn't have to initialize 4 collections for it.
clang/lib/Sema/SemaSYCL.cpp
Outdated
(SYCLKernelDef->hasAttr<SYCLSimdAttr>() ? ESIMDModeCheckSet | ||
: NonESIMDModeCheckSet); | ||
Marker.TraverseStmt(SYCLKernelDef->getBody()); | ||
for (FunctionDecl *SYCLFunctionDec : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit doesn't seem right either. From this function's perspective, all of this should be managed by the marker it self.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this patch needs lit tests... you claim to be fixing a reproducer, but I don't see an added test, nor anything for the esimd versions you're attempting to allow.
I think we should probably consider this from a pile of test cases that test the expected-good, expected-bad, and edge case functionality. @keryell brought up the dead code test case, but I suspect an uninstantiated template will run into similar problems, and I'm starting to worry about interplay with macros as well. We may also need to get the SYCL spec clarified as https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:language.restrictions.kernels just says Can you put together a bunch of really good lit tests that @erichkeane and I can look over? We can work backwards from the tests to see whether we need more refactoring to be done here or not. |
Sure, I will add some lit tests. |
I'm not aware of such an ESIMD feature. @kbobrovs, are you? |
No, this is not supported in ESIMD neither. |
Thanks! I won't consider it then when going through the re-implementation/refactor here. |
Yes, so I believe ESIMD should obey the same rules that exist for regular SYCL kernels regarding recursive functions, right? @jinge90, if your earlier patch triggered errors in ESIMD tests, then I think those tests should be fixed. Do you have a list of those tests maybe? |
@fveselov, your PR intel/llvm-test-suite#143 adds recursive ESIMD test. This feature is not supported by ESIMD, so can you please remove it. @vladimirlaz, please add me as a code owner for ESIMD llvm-test-suite tests. |
@kbobrovs I added code owners for intel/llvm-test-suite few weeks ago. Now you are code owner for ESIMD tests: https://github.com/intel/llvm-test-suite/blob/intel/.github/CODEOWNERS#L30 |
Thanks! |
Hi, @DenisBakhvalov |
FYI: Don't bother updating this patch other than adding some lit tests. We are not going to accept the patch, instead I am going to include the tests/test situation into a rewrite of most of the code here. It is horribly disorganized and intermingled with other behaviors, so we're going to clean it up in the next few days. |
Signed-off-by: gejin <ge.jin@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So at the base of it, it seems that all this does now is switch a warning to an error. I don't really have a problem with that anymore, though I'm a little odded-out by the fact that the error message already existed.
This also only steps on my re-write in 1 place (the SemaSYCL change), but that should be easy enough to merge.
Hi, @AaronBallman and @erichkeane
|
Agreed. Seeing your change here, I don't have any problems with it, it is pretty innocuous in that it only reverts a warning to an error in a case that the device can't handle anyway. I'm OK with that. I've also captured in my change a couple of other things, including properly not-checking if-constexpr dead branches as @keryell requested. That part likely requires a little work in the CallGraph object to make sure we don't evaluate the dead side of it. Perhaps that is another follow-up to my refactor. Another thing I have on my list is to fix how we identify 'kernel body wrappers' that get added in the library. THAT is likely going to be another difficult, perhaps separate task. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the changes in this patch are reasonable as far as they go.
@@ -34,7 +34,7 @@ template <typename name, typename Func> | |||
__attribute__((sycl_kernel)) void kernel_single_task2(const Func &kernelFunc) { | |||
// expected-note@+1 {{called by 'kernel_single_task2}} | |||
kernelFunc(); | |||
// expected-warning@+1 2{{SYCL kernel cannot call a recursive function}} | |||
// expected-error@+1 2{{SYCL kernel cannot call a recursive function}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this diagnostic issued twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still curious about this.
@erichkeane I was thinking about the rewrite of this logic rather than sleeping last night, and one thing we should discuss is where this kind of check lives. This seems more like an intraprocedural analysis and that suggests to me that it should either live in |
So there are
While I think putting this in the static-analyzer is a good place for it, I don't think it can live in I think |
One concern I do have with this patch is that we're strengthening a warning that's an error by default into always being an error, but the analysis we're performing isn't sound. Typically, errors are typically reported only for things where we know we are always diagnosing the problematic construct (because warnings can be disabled in the cases where the analysis is incorrect). Pushing the check into CSA helps make it clear that this is a best effort attempt to diagnose problematic code rather than going to catch all the programmer's mistakes. |
FYI: PR#219 disabling ESIMD recursion tests is merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM!
/summary:run |
Can you please explain that SYCL kernel cannot call a recursive function ? |
@zjin-lcf It is forbidden in the SYCL standard to align to the minimal feature set of the accelerators envisioned by the SYCL standard. |
I assume that recursion is not useful for most application developers porting a CPU program; otherwise, the feature would be included in SYCL Before. I will try to convert recursion to iteration. |
Signed-off-by: gejin ge.jin@intel.com
Latest compiler doesn't report error for usage of recursive code in SYCL kernel. If some recursive code is used in SYCL kernel, latest compiler can compile well but runtime error may happen, following code can compile well but crashes using "-O0" on GPU:
recursion.cpp
According to SYCL 2020 spec(https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:language.restrictions.kernels), recursion is not allowed in SYCL kernel, we had better report compiling error for this illegal behavior. Moving runtime crash to compiling error is more friendly to users too.