-
Notifications
You must be signed in to change notification settings - Fork 736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Fix Ambiguity in Overloaded Unary Minus Operator for bfloat16 #15393
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…LIntelRegister, SYCLIntelMemory) This patch uses MutualExclusions tablegen support to allow us to remove a custom diagnostic checking codes with FPGA attributes: [[intel:fpga_register]] and [[intel::fpga_memory]]. No test is added as we alreday have an existing LIT test (SemaSYCL/local.cpp) that shows the behavior.
…ntel#14364) This commit makes the following fixes to group algorithms: * Some GroupNonUniform SPIR-V builtins were incorrectly named as they were bundled together with their KHR Group SPIR-V equivalents. These have been renamed to map correctly to the right SPIR-V operations. * `sycl::marray` is now considered when checking for arithmetic types, making it usable in group broadcast operations and the functions that use them. * The representation of `bool` in `sycl::vec` has been changed to unsigned to match the representation picked by `ConvertToOpenCLType`. * The representation of `char` in `sycl::vec` has been changed to to match the representation picked by `ConvertToOpenCLType` to avoid cases where signedness would cause type mismatches. --------- Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
intel#14502) Only emit the provided values as annotations in the LLVM IR. The NVPTX backend will pad missing values with 1s. This suits the fact that the attribute must provide as many values as the dimensionality of the work-group, and we can assume that the work-group size of unused dimensions is 1.
It is difficult to understand how to detect all possible data members which can cross ABI boundaries and cause problems because of dual ABI issue. For now using this approach which covers most of the classes, more than currently covered by "sycl/test/abi/layout*" tests. Current hits are in sycl::detail CG classes and subclasses (MFileName, MFunctinName, MKernelName and PipeName string data members). Even though I'm not sure if those cross ABI boundaries or not, I think we have to be conservative and fix those too. Using -fdump-record-layouts-complete to get dump for all complete record types, see description in https://reviews.llvm.org/D104484 for details.
This moves some of the members in the handler class to its impl class. Doing so allows developers to change these arguments without breaking ABI. This also moves more implementation details, such as command-group classes, launch configuration information, argument information and HostTask tracking, into sources to avoid hard-to-find ABI breaks in the communication between headers and runtime library. In addition to this, the following improvements are made: * The HostKernel class has been simplified to no longer have call and runOnHost functions. * The HostTask wrapper class has been moved to sources and the owner has been changed from a unique_ptr to a shared_ptr, which prevents the need for including host_task_impl.hpp in odd places. --------- Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
The `LIT_OPTS` were different compared to Linux so the output logs from CI runs is different. I noticed this when searching for passed tests in the Windows log and noticed it wasn't there. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
…el#13597) This commit folds the implementation of host_half_impl::half into half_impl::half and making the vector element representation the same as the half representation. This allows us to avoid strict alias violation for half vectors in their operator[] implementations. Note that this is marked as an ABI break as it removes symbols on Windows, despite these symbols never being in the library. --------- Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
I recently found out ocloc tests weren't running in Windows CI because the ocloc tool wasn't installed on the runners. When trying to fix it, I [hit](https://github.com/intel/llvm/actions/runs/9909706895/job/27379893885?pr=14114) two failures that would have always failed if we were testing (the other two are going to be fixed [here](intel#14556)). The first fix disables the test on Windows because it is using `pvc` which is not available on Windows. The seconds add a requirement for the OCL CPU driver to be installed because the test is using `ocloc-aot`, as per the [ocloc-aot doc](https://github.com/intel/llvm/blob/56e88d591c52a978abdd5e4279853311cae4a55e/opencl/opencl-aot/README.md?plain=1#L11). The CI testing in this PR is only confirming I didn't break the tests on non-win ocloc because I haven't enabled Windows ocloc testing yet, but I manually verified the fix. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
In intel#12682 the mutating operators for swizzles (+=, -=, ..., ++, --) were reverted to be members rather than friends. Since swizzles mutate the underlying vec rather than themselves these operators should take and return constant references instead, which this commit implements. --------- Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
kernel_compiler include_file support shouldn't have files that might collide. Spec has been recently clarified as well ( intel@a6d8758 )
Update existing SYCL based driver tests to use the new offload model and retain the equivalent old offloading model tests be creating copies. These tests use the --offload-new-driver and --no-offload-new-driver options to force a particular model to follow.
…mize code (intel#14577) This commit makes several improvements to the `CodeGenModule::addGlobalIntelFPGAAnnotation` method: - Updating the comment style to be Doxygen compliant for the `addGlobalIntelFPGAAnnotation` function. - Replaced `getAs<RecordType>()` with `castAs<RecordType>()` after confirming that the variable declaration type is guaranteed to be a `RecordType`. This change removes the need for a null check and streamlines the code. - Modified the base class iteration loop to use a reference (`const auto &Base`) to avoid unnecessary copies of `CXXBaseSpecifier` objects. These changes enhance code readability, improve code documentation and potentially increase performance by avoiding unnecessary dynamic type checks and copies. --------- Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
smanna12
had a problem deploying
to
WindowsCILock
September 13, 2024 15:21
— with
GitHub Actions
Error
smanna12
had a problem deploying
to
WindowsCILock
September 13, 2024 16:13
— with
GitHub Actions
Error
smanna12
temporarily deployed
to
WindowsCILock
September 13, 2024 16:27
— with
GitHub Actions
Inactive
smanna12
changed the title
Fix build
[SYCL] Fix Ambiguity in Overloaded Unary Minus Operator for bfloat16
Sep 13, 2024
tahonermann
approved these changes
Sep 13, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Thank you @tahonermann for review! |
smanna12
temporarily deployed
to
WindowsCILock
September 13, 2024 17:33
— with
GitHub Actions
Inactive
smanna12
temporarily deployed
to
WindowsCILock
September 13, 2024 19:38
— with
GitHub Actions
Inactive
smanna12
temporarily deployed
to
WindowsCILock
September 13, 2024 20:09
— with
GitHub Actions
Inactive
bso-intel
approved these changes
Sep 13, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you @bso-intel for reviews! |
@intel/llvm-gatekeepers, this PR is ready to merge. Thank you |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This commit resolves an overload resolution ambiguity on Windows self-builds in downstream pulldown when using the unary minus operator with bfloat16. The operator now takes a const-qualified argument, enabling its use with both lvalues and rvalues and fixing the reported build error.