[SYCLomatic] Merge branch 'origin/sycl' into SYCLomatic #252
Merged
According to the SYCL 2020 spec, section D.1, "What has changed from SYCL 1.2.1 to SYCL 2020":
> The program class has been removed and replaced with a new
> class `kernel_bundle`, which provides similar functionality in a
> type-safe and thread-safe way.

Removal of the `program_impl` class will be done in a separate commit, since it is not an ABI-breaking change and some performance analysis should be done as part of that removal. Tests depending on `sycl::program` were removed in intel/llvm-test-suite#1187.
Per Fortran 2018 18.2.3.3, the intrinsic module procedure C_F_POINTER(CPTR, FPTR [, SHAPE]) associates a data pointer with the target of a C pointer and specifies its shape. CPTR shall be a scalar of type C_PTR, and its value is the C address or the result of a reference to C_LOC. FPTR is a pointer, either scalar or array. SHAPE is a rank-one integer array, and it shall be present if and only if FPTR is an array. C_PTR is a derived type with a single 64-bit integer component whose value is the address. Build the right "source" fir::ExtendedValue based on the address and shape, and use associateMutableBox to associate the pointer with the target of the C pointer. Refactor the code that gets the address of C_PTR so it can be reused. Reviewed By: jeanPerier Differential Revision: https://reviews.llvm.org/D132303
This adds new VFCVT pseudoinstructions that take a rounding mode operand. A custom inserter is used to insert additional instructions to change FRM around the VFCVT. Some of this is borrowed from D122860, but takes a somewhat different direction. We may migrate to that patch, but for now I was trying to keep this as independent from RVV intrinsics as I could. A followup patch will use this approach for FROUND too. Still need to fix the cost model. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D133238
For now, clang and gcc both fail to generate the sae version from _mm512_cvt_roundps_ph: https://godbolt.org/z/oh7eTGY5z. The Intrinsics Guide description is also wrong and will be updated soon. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D132641
The issue occurred when two subregs of the same reg are used in the same instruction (e.g. inline asm), one as "def early-clobber" and the other as just "def". The register coalescer went into bad recursion if the early-clobbered subreg came second in the following sequence of COPYs. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D127136
support for OpenMP programs. This is the 5th of 6 patches starting from https://reviews.llvm.org/D100181. This plugin code, when loaded in gdb, adds a few commands such as `ompd icv`, `ompd bt`, and `ompd parallel`. These commands create an interface for GDB to read the OpenMP runtime through libompd. Reviewed By: @dreachem Differential Revision: https://reviews.llvm.org/D100185
This commit removes the `relocTargets` vector, and instead makes the code reconstruct the referent addresses from the relocated instructions. This will allow us to move `applyOptimizationHints` from `ConcatInputSection::writeTo` to a separate pass that parses and applies LOHs in one step, on a per-file basis. This will improve performance, as parsing is currently done serially in `ObjFile::parse`. I opted to remove the sanity check that ensures that all relocations within a LOH point to the same symbol. This completely eliminates the need to search through relocations. It is my understanding that mismatched relocation targets should not be present in valid object files, so it's unlikely that the removal will lead to mislinks. Differential Revision: https://reviews.llvm.org/D133274
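To illustrate what "reconstructing the referent address from the relocated instructions" can look like, here is a minimal sketch that decodes an already-relocated AArch64 `ADRP` + `ADD` pair. It assumes the hint covers that instruction pair and that the `ADD` uses an unshifted 12-bit immediate; the function names `adrpTarget`, `addImmediate`, and `referentAddress` are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Page address computed by an ADRP instruction located at address PC.
// Encoding: immlo in bits 30:29, immhi in bits 23:5, forming a signed
// 21-bit page offset.
std::uint64_t adrpTarget(std::uint32_t Insn, std::uint64_t PC) {
  std::uint64_t ImmLo = (Insn >> 29) & 0x3;
  std::uint64_t ImmHi = (Insn >> 5) & 0x7FFFF;
  std::int64_t Imm21 = static_cast<std::int64_t>((ImmHi << 2) | ImmLo);
  if (Imm21 & (1 << 20)) // sign-extend from 21 bits
    Imm21 -= (1 << 21);
  return (PC & ~0xFFFull) + static_cast<std::uint64_t>(Imm21 * 4096);
}

// Unshifted 12-bit immediate of an ADD (immediate) instruction (bits 21:10).
std::uint64_t addImmediate(std::uint32_t Insn) { return (Insn >> 10) & 0xFFF; }

// Address referenced by an ADRP at PC followed by an ADD of the page offset.
std::uint64_t referentAddress(std::uint32_t Adrp, std::uint64_t PC,
                              std::uint32_t Add) {
  return adrpTarget(Adrp, PC) + addImmediate(Add);
}
```

Because the address is recovered from the instruction bits alone, no lookup through the relocation list is needed, which is the property the commit message relies on.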
[dcl.fct.def.coroutine]p12 says:
> If both a usual deallocation function with only a pointer parameter
> and a usual deallocation function with both a pointer parameter and a
> size parameter are found, then the selected deallocation function
> shall be the one with two parameters.

However, the sized deallocation function is disabled by default for ABI reasons, so this sentence was never tested or covered. This commit adds a test for it.
According to [dcl.fct.def.coroutine]p12, the program should be ill-formed if the promise_type declares operator delete but none of the declared overloads is usable. This behavior was not tested before. This commit adds tests for it.
The if-statement should check whether TFLITE is on, rather than whether the variable is specified. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D132902
This matches the behavior for all the other -fopenmp options, as well as -frtlib-add-rpath. For context, Fedora passes this flag by default in case OpenMP is used, and this results in a warning if it (usually) isn't, which causes build failures for some programs with unnecessarily strict build systems (like Ruby). Differential Revision: https://reviews.llvm.org/D133316
Differential Revision: https://reviews.llvm.org/D133332
…other threads This will be used as a replacement for selecting over a pipe fd, which does not work on windows. The posix implementation still uses a pipe under the hood, while the windows version uses windows event handles. The idea is that, instead of writing to a pipe, one just inserts a callback, which does whatever you wanted to do after the bytes come out the read end of the pipe. Differential Revision: https://reviews.llvm.org/D131160
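The idea of "inserting a callback instead of writing to a pipe" can be sketched roughly as follows. This is a simplified illustration, not the LLDB implementation: the class `CallbackQueue` and its method names are hypothetical, and the step that actually wakes the loop thread (a pipe write on POSIX, an event handle on Windows) is only indicated by a comment.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the callback-handling part of a main loop.
class CallbackQueue {
public:
  // Safe to call from any thread: queues work for the loop thread.
  void addPendingCallback(std::function<void()> Callback) {
    std::lock_guard<std::mutex> Lock(Mutex);
    Pending.push_back(std::move(Callback));
    // A real loop would also wake the loop thread here.
  }

  // Called on the loop thread: runs everything queued so far and
  // returns how many callbacks were executed.
  std::size_t runPendingCallbacks() {
    std::vector<std::function<void()>> ToRun;
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      ToRun.swap(Pending); // run callbacks without holding the lock
    }
    for (auto &Callback : ToRun)
      Callback();
    return ToRun.size();
  }

private:
  std::mutex Mutex;
  std::vector<std::function<void()>> Pending;
};
```

Swapping the queue out under the lock and running the callbacks afterwards lets a callback enqueue further callbacks without deadlocking.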
CONFLICT (content): Merge conflict in clang/lib/Driver/Driver.cpp
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695 (although it still struggles with avx512 predicate numbers, which had to be done manually). SSE numbers are still too low for the FCMP_ONE/FCMP_UEQ cases, which expand to a more complex sequence than the existing 'ExtraCost' system can manage.
This simplifies completeness comparisons against OpenCLBuiltins.td and also makes the header no longer "claim" the argument name identifiers. Continues the direction set out in D119560.
… path The main difference is that this preserves intermediate rounding steps, which the other route doesn't. This aligns bfloat16 more with half floats, which use this path on most targets. I didn't understand what the difference was between these softening approaches when I first added bfloat lowerings, would be nice if we only had one of them. Based on @pengfei 's D131502 Differential Revision: https://reviews.llvm.org/D133207
…haredMemory.cpp (NFC)
Example of BraceWrapping AfterClass is wrong Differential Revision: https://reviews.llvm.org/D133087
Previously, the heuristic was simply to look for template argument-specific keywords, such as typename, class, template and auto, that are preceded by a left angle bracket <. This changes the heuristic to instead look for a left angle bracket < preceded by a right square bracket ], since according to the C++ grammar, the template arguments must *directly* follow the introducer. (This sort of check might just end up being *too* aggressive.) This patch also adds a bunch more token annotator tests for lambdas, specifically for some of the stranger forms of lambdas now allowed as of C++20 or soon-to-be-allowed as part of C++23. Fixes llvm/llvm-project#57093. This does NOT resolve the FIXME regarding explicit template lists, but perhaps it gets closer. Differential Revision: https://reviews.llvm.org/D132295
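A rough sketch of the new heuristic, using plain strings in place of clang-format's token objects. The function name `startsLambdaTemplateList` and the string-based token model are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if the token at index I is a '<' that directly follows a ']',
// i.e. it could open the template parameter list of a lambda.
bool startsLambdaTemplateList(const std::vector<std::string> &Tokens,
                              std::size_t I) {
  return I > 0 && I < Tokens.size() && Tokens[I] == "<" &&
         Tokens[I - 1] == "]";
}
```

For the tokens of `[]<typename T>(T x) {}` the check fires at the `<`. It also fires for `a[i] < b`, which shows why the commit message worries that the check may be too aggressive when used on its own.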
The old replacements will be removed soon:
- `%linalg_test_lib_dir`
- `%cuda_wrapper_library_dir`
- `%spirv_wrapper_library_dir`
- `%vulkan_wrapper_library_dir`
- `%mlir_runner_utils_dir`
- `%mlir_integration_test_dir`

Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D133270
When the only ADR instruction we have is the 16-bit thumb one then all constant pool entries need to be 4-byte aligned, as tADR has an offset that's a multiple of 4. It looks like previously there happened to be no situations in which we encountered a constant pool entry with alignment less than 4, so failing to do this didn't cause any problems, but the expansion of cttz to a table added by D128911 does use a constant pool with alignment 1, so we now need to handle it correctly. Differential Revision: https://reviews.llvm.org/D133199
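The alignment rule described above can be sketched as a small helper. The names `effectiveEntryAlign` and `alignUp` and the boolean flag are hypothetical; the actual change lives in the ARM backend and is structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Alignment to use for a constant pool entry. When only the 16-bit Thumb ADR
// is available, its offset is a multiple of 4, so entries need at least
// 4-byte alignment.
std::uint64_t effectiveEntryAlign(std::uint64_t Requested, bool OnlyThumb1ADR) {
  const std::uint64_t MinAlign = 4;
  return (OnlyThumb1ADR && Requested < MinAlign) ? MinAlign : Requested;
}

// Rounds Offset up to the next multiple of Align (a power of two).
std::uint64_t alignUp(std::uint64_t Offset, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Offset + Align - 1) & ~(Align - 1);
}
```

With this sketch, a byte table requested at alignment 1 and placed at offset 5 would be moved to offset 8 when only the 16-bit ADR is available.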
Fix incorrectly formatted python file.
…dicates These require special handling to account for their expansion in lowering. I'm trying very hard not to have to add predicate specific costs - but it might be inevitable.....
Split the read thread support from Communication into a dedicated ThreadedCommunication subclass. The read thread support is used only by a subset of Communication consumers, and it adds a lot of complexity to the base class. Furthermore, having a dedicated subclass makes it clear whether a particular consumer needs to account for the possibility of the read thread running.

The modules currently calling `StartReadThread()` are updated to use `ThreadedCommunication`. The remaining modules use the simplified `Communication` class.

`SBCommunication` is changed to use `ThreadedCommunication` in order to avoid changing the public API.

`CommunicationKDP` is updated in order to (hopefully) compile with the new code. However, I do not have a Darwin box to test it, so I've limited the changes to the bare minimum.

`GDBRemoteCommunication` is updated to become a `Broadcaster` directly. Since it does not inherit from `ThreadedCommunication`, its event support no longer collides with the one used for the read thread and can be implemented cleanly. The support for `eBroadcastBitReadThreadDidExit` is removed from the code -- since the read thread was not used, this event was never reported.

Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.llvm.org/D133251
Resolve min/max conflict in the same way as it was done here: intel/llvm#1339 Needed to fix this failure: https://github.com/intel/llvm/actions/runs/3107743966/jobs/5036187282
This commit renames metadata nodes introduced by the optional device features design document from using `intel_` prefix to use `sycl_` prefix. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
intel/llvm#6852 incidentally used a non-constexpr solution for determining the size of a string, which causes failures in select build configurations. This commit amends this by using sizeof on a static array of characters instead of std::strlen. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
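A minimal sketch of the technique, assuming a hypothetical string constant `ExampleName`; the identifier touched by the actual commit is not shown here.

```cpp
#include <cstddef>

// Hypothetical name, used only for illustration.
static constexpr char ExampleName[] = "example_name";

// sizeof on a char array includes the terminating '\0', so subtract one.
// Unlike std::strlen, this is a constant expression under every compiler.
constexpr std::size_t ExampleNameLen = sizeof(ExampleName) - 1;

static_assert(ExampleNameLen == 12, "length is known at compile time");

// Generic variant: deduce the array bound from a string literal.
template <std::size_t N>
constexpr std::size_t literalLength(const char (&)[N]) {
  return N - 1;
}
```

`std::strlen` is not declared `constexpr` by the standard, so whether a call to it can appear in a constant expression depends on compiler extensions. That is consistent with the failure showing up only in some build configurations.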
This commit adds a new module-level metadata node !intel_sycl_aspects which contains metadata pairs of the enum element names and integral values of the SYCL aspects enum, identified by the [[__sycl_detail__::sycl_type(aspect)]] attribute. This commit also makes the SYCLPropagateAspectsUsage pass read !intel_sycl_aspects and use this information to determine the value of the fp64 aspect instead of relying on it being synchronized with the SYCL implementation headers. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Currently PI_KERNEL_MAX_SUB_GROUP_SIZE in the PI OpenCL backend uses the max work-item sizes as the input to the corresponding OpenCL query to avoid truncation. However, using the max work-item sizes in all dimensions may exceed the limit on the total number of work-items. To prevent this limit from being exceeded, this commit changes the query to use the max work-item size in the first dimension only and 1s in the other dimensions. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
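The change to the query input can be sketched as follows. The helper names and the sample device limits are assumptions for illustration; the real code passes the resulting sizes to the OpenCL sub-group query.

```cpp
#include <array>
#include <cstddef>

// Builds the work-group shape used as query input: the device's max
// work-item size in dimension 0 and 1 in the remaining dimensions.
std::array<std::size_t, 3>
subGroupQueryInput(const std::array<std::size_t, 3> &MaxWorkItemSizes) {
  return {{MaxWorkItemSizes[0], 1, 1}};
}

// Total number of work-items described by a shape.
std::size_t totalWorkItems(const std::array<std::size_t, 3> &Shape) {
  return Shape[0] * Shape[1] * Shape[2];
}
```

For a hypothetical device reporting per-dimension maxima of 1024 x 1024 x 64, the old input describes 67,108,864 work-items, while the new input describes 1024. The new total can never exceed the first-dimension maximum.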
Signed-off-by: Michael Aziz <michael.aziz@intel.com>
This reverts commit 2fb1d0e.
AndyCHHuang reviewed Sep 29, 2022
YuriPlyakhin approved these changes Sep 30, 2022
No description provided.