[SYCL] Implement no-decomposition for kernel types that don't need it. #2477
Kernel arguments don't need to be decomposed unless they contain a pointer or a special type, so we don't want to decompose structs/arrays that contain neither. This patch accomplishes that.

First, we add a new attribute without a spelling. It is applied during the 'checking' stage, and the later visitors can then check it to see if decomposition is necessary.

Next, we add a new checker that runs during the checking stage and applies the attribute. Basically, a container doesn't need to be decomposed if all of its 'children' are acceptable, so we simply hold a stack of the containers to tell which need to be decomposed. This, of course, works recursively.

Finally, we add some new calls to the visitor that handle the case of a 'simple array' and a 'simple struct', which are the ones that don't require decomposition.
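To make the stack-of-containers idea concrete, here is a minimal standalone sketch. It is a toy model, not the code in this patch: `TypeNode`, `Kind`, `DecompositionChecker`, and `makeExample` are hypothetical names made up for illustration, and the `needsDecomposition` flag merely stands in for the spelling-less attribute. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a kernel-argument type (hypothetical, not Clang's AST).
enum class Kind { Scalar, Pointer, Special, Struct, Array };

struct TypeNode {
  Kind kind;
  std::string name;
  std::vector<TypeNode> children;  // struct fields, or an array's element type
  bool needsDecomposition;         // stands in for the spelling-less attribute
};

TypeNode leaf(Kind k, std::string name) {
  return TypeNode{k, std::move(name), {}, false};
}

TypeNode container(Kind k, std::string name, std::vector<TypeNode> children) {
  return TypeNode{k, std::move(name), std::move(children), false};
}

class DecompositionChecker {
  // One entry per container currently being visited. An entry becomes true
  // once any descendant turns out to be a pointer or a special type.
  std::vector<bool> containerStack;

  void visit(TypeNode &node) {
    switch (node.kind) {
    case Kind::Scalar:
      break;  // acceptable child, nothing to record
    case Kind::Pointer:
    case Kind::Special:
      // Mark the innermost enclosing container, if there is one.
      if (!containerStack.empty())
        containerStack.back() = true;
      break;
    case Kind::Struct:
    case Kind::Array:
      containerStack.push_back(false);
      for (TypeNode &child : node.children)
        visit(child);
      assert(!containerStack.empty());
      node.needsDecomposition = containerStack.back();
      containerStack.pop_back();
      // A container that must be decomposed forces its parent to be too.
      if (node.needsDecomposition && !containerStack.empty())
        containerStack.back() = true;
      break;
    }
  }

public:
  void check(TypeNode &root) {
    visit(root);
    assert(containerStack.empty());
  }
};

// Example tree: a kernel object holding a 'simple struct' and an array of
// structs that contain a pointer.
TypeNode makeExample() {
  TypeNode simple = container(
      Kind::Struct, "Simple",
      {leaf(Kind::Scalar, "int"), leaf(Kind::Scalar, "float")});
  TypeNode withPtr = container(
      Kind::Struct, "WithPtr",
      {leaf(Kind::Scalar, "int"), leaf(Kind::Pointer, "int*")});
  TypeNode arr = container(Kind::Array, "WithPtr[2]", {withPtr});
  return container(Kind::Struct, "KernelObject", {simple, arr});
}
```

Running `DecompositionChecker().check(root)` on the tree from `makeExample()` leaves `Simple` unmarked (it could be passed as a single kernel argument), while `WithPtr`, the array around it, and the enclosing `KernelObject` are all marked as needing decomposition.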
Draft PR. I think the implementation is done and the test fixes are just 'lifting work', so I figured I'd show it off. The array init needed a bit more work, but the rest seems pretty trivial.

Test failures are:

This HOPEFULLY doesn't result in any check-sycl failures, as it should not change anything at runtime (besides fewer kernel params), but I want to see what the buildbots have to say.
I will need to review this again (and possibly again :-) ). Quick comments/questions so far.
Cool beans, looks like we have no check-sycl failures, just the 10 test failures (11 on the server, but the static-lib tests are REALLY suspicious; I suspect they don't belong in check-clang, particularly since they seem platform-specific). So it appears the surgery hasn't broken anything at runtime :)
This is necessary since the size-checker needs opt-in (so that it properly reflects the OpenCL kernel arguments). When they are in the same invocation, the size-checker is erroneously called thinking that, the first time we see a struct, it doesn't need to be decomposed. I opted to use the same visitor, since it doesn't have state.
Down to 5 failures, only CodeGenSYCL tests left:

Also fixed a couple of issues I found along the way.
Alright, all of the existing tests now work (at least the ones that I reproduced...). After the build-bots run I'll fix whatever they find, then take a look at writing a test to validate the non-decomp/decomp itself. Additionally, at some point, we might want to consider what we do when the kernel object ITSELF doesn't need decomposition! This is a bit of a special case that might end up being quite a bit of surgery, so I'll keep it for a separate patch.
Alright, I think this is ready for review! I MIGHT have that test failure on the servers that I mentioned earlier, but I'll have to check it out when the build-bot is accessible again.
…signed long long to work here
Ok, I fixed the last issue! Ready for review @Fznamznon @premanandrao @elizabethandrews.
I have not looked at tests yet and I need to go over this again tomorrow but I have some initial comments and questions.
Ok. I spent some time on this and it looks ok to me, except for a couple of small nits and a request for one more test.
LGTM. @premanandrao, are you ok with this?
Reviewing this now.
I don't have any further comments.
/summary:run
…_wrapper

* upstream/sycl: (1533 commits)
  [SYCL] XFAIL sub_group shuffle tests on GPU
  [SYCL] Add support for L0 loader validation layer (intel#2520)
  [NFC][LIT] Temporary disable function pointers as they hang on L0 (intel#2544)
  [SYCL] use release version of OpenCL ICD loader
  [SYCL] Improve testing of host-task (intel#2540)
  Revert 1291215
  [SYCL] Fix warning caused by [[nodiscard]] attribute (intel#2545)
  [SYCL] Workaround windows build failure
  [SYCL] Remove kernel_signature_start from int header (intel#2537)
  [SYCL] Fix ABI tests in post-commit (intel#2539)
  [SYCL][DOC] Update C-CXX-StandardLibrary doc to align with latest status (intel#2529)
  [SYCL][NFC] Fix static code analysis concerns (intel#2531)
  [SYCL][NFC] Improve testing for accessor_property_list (intel#2532)
  [SYCL] Avoid overuse of CPU on wait read-write lock loop (intel#2525)
  [SYCL] Implement no-decomposition for kernel types that don't need it. (intel#2477)
  [SYCL] Add group algorithm constraints (intel#2462)
  [BuildBot] Uplift Windows GPU RT from 8673 to 8778 (intel#2533)
  [SYCL][LIT][NFC] Extend ABI test suite (intel#2522)
  [SYCL][DebugInfo] Reinstate source locations for some kernel instructions (intel#2527)
  [SYCL][NFC] Replace the deprecated VectorType::getNumElements() (intel#2524)
  ...