forked from llvm/llvm-project
merge main into amd-staging #1035
Closed
…er. [NFCI] (llvm#174890) This function is invoked only for ELF targets, so it has been moved to the ELF-specific streamer. An assertion has been added to catch any invocation for a non-ELF target.
…vm#174695) Remove unneeded load instructions and keep only one comparison instruction.
…antCopyElimination (llvm#174706) This patch follows what was done for Xqcibi in llvm#174358.
…t() (llvm#174438) Some machines can make use of AUIPC + ADDI or LUI + ADDI fusion; make sure to account for that in the cost model for `RISCVTTIImpl::getConstantPoolLoadCost()`.
…lvm#174811) The keep-registers mode isn't super useful without disabling explicit-locals, as the local gets/sets are irrelevant noise in most cases. Switching this test makes the output much more concise and will make upcoming changes easier to review.
Not sure if this will fix the problem because I don't have a 32-bit arm machine to test with.
…vm#173268) This commit adds family-specific support for the following intrinsics:
- ldmatrix
- stmatrix
- mma.block_scale, mma.sp.block_scale
- redux.sync
- cvt.rs
- clusterlaunchcontrol
- setmaxnreg
- tcgen05.mma

Removed the `hasTcgen05Instructions` function in favour of `hasTcgen05InstSupport`. Updated the wmma.py script with family-specific support and added new tests.
This allows speculating recursively speculatable operations containing `fir.result`. Note that making it Pure does not allow speculating `fir.result` itself from its containing operation, since it is a terminator.
Fixes a failure on llc for ubsan builds:
../lib/Target/RISCV/RISCVAsmPrinter.cpp:552:7: runtime error: downcast of null pointer of type 'RISCVTargetStreamer'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Target/RISCV/RISCVAsmPrinter.cpp:552:7
llvm#173659) This PR fixes a crash by validating the type of the `kind` attribute. For `vector.contract` and `vector.outerproduct`, the verifier now emits an error when `kind` is not a CombiningKindAttr. Fixes llvm#173555.
…e` to `shufflevector` (llvm#169110) Resolves llvm#169058.

This adds ~~an InstCombine pass~~ a TTI hook to the WebAssembly backend that folds `i8x16.swizzle` and `i8x16.relaxed.swizzle` operations to `shufflevector` operations if their mask operands are constant. This is mainly useful for abstractions over the raw intrinsics--for instance, in architecture-generic SIMD code that may not be able to expose the constant shuffles due to type system limitations.

I took most of this from the x86 backend (in particular, `simplifyX86vpermilvar` in `X86InstCombineIntrinsic`), and adapted it for the WebAssembly backend. There wasn't any previous `instCombineIntrinsic` method on the WebAssembly `TargetTransformInfo`, so I added it. Right now, this swizzle optimization is the only one it performs.

As I noted in the transform itself, the "relaxed" swizzle actually has stricter preconditions than the non-relaxed one. If a non-negative but still out-of-bounds index is provided, the "relaxed" swizzle can choose between returning 0 and the lane at the index modulo 16. However, it must make the same choice every time, and we don't know which choice the runtime will make, so we can't constant-fold it.

The regression tests were mostly generated by Claude and adapted a bit by me (I tried to follow the [InstCombine contributor guide](https://llvm.org/docs/InstCombineContributorGuide.html#tests)). There was previously no WebAssembly subdirectory within the InstCombine tests, so I created that too; as of now, the swizzle fold test is the only file in it. Everything else was written by myself (well, partly copy-pasted from the x86 backend).

I'm not sure how to write an Alive2 test for this; I can't find any examples where the input is an arbitrary constant.
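The fold described above can be modelled outside LLVM. The following is a minimal Python sketch, not the code from the patch: the helper names (`swizzle_to_shuffle_mask`, `apply_shuffle`, `swizzle`) are hypothetical, and it assumes the usual swizzle semantics in which an in-range index selects that lane and an out-of-range index yields 0. For the relaxed form it assumes, based on the description above, that only indices in 16..127 are left to the runtime's choice, while indices with the top bit set (negative as `i8`) still yield 0.

```python
from typing import List, Optional

LANES = 16
# In a two-operand shuffle, indices 16..31 select from the second operand.
# Using an all-zero second operand, index 16 therefore means "produce 0".
ZERO_LANE = LANES


def swizzle_to_shuffle_mask(mask: List[int], relaxed: bool = False) -> Optional[List[int]]:
    """Map a constant swizzle mask (16 unsigned bytes) to a shuffle mask.

    Returns None when the fold is not safe (hypothetical helper, for illustration).
    """
    if len(mask) != LANES:
        raise ValueError("expected 16 mask bytes")
    out = []
    for idx in mask:
        if not 0 <= idx <= 255:
            raise ValueError("mask entries must be unsigned bytes")
        if idx < LANES:
            out.append(idx)          # in range: pick that lane of the input
        elif relaxed and idx < 128:
            return None              # runtime may return 0 or lane idx % 16
        else:
            out.append(ZERO_LANE)    # out of range: result lane is 0
    return out


def apply_shuffle(a: List[int], b: List[int], shuffle_mask: List[int]) -> List[int]:
    """Two-operand shuffle: indices address the concatenation of a and b."""
    both = list(a) + list(b)
    return [both[i] for i in shuffle_mask]


def swizzle(a: List[int], mask: List[int]) -> List[int]:
    """Reference behaviour of the non-relaxed swizzle."""
    return [a[i] if i < LANES else 0 for i in mask]
```

Applying the computed shuffle mask to the input and an all-zero vector should match the reference `swizzle` for any constant mask, while a relaxed swizzle with an index such as 16 is left unfolded (`None`).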
This patch handles the special case where an extract value yields an aggregate result, which is then used as an argument to a store. The SPIRV BE uses special intrinsics (`spv_extractv` and `spv_store`) to represent these through IRTranslator; however, this creates a problem: `spv_store` is called as a function, and IRTranslator cannot handle arguments that take more than one vreg. For other functions, the aggregate argument replacement pass would have solved things, but it does not apply here. Hence, we apply the same mutate-into-Int32 solution when dealing with stores, and restore the extract value's type (which we have available as a ValueAttr) during instruction selection.
…, Z) (llvm#173808) We use fli+fneg to generate a negative float; eliminate the fneg for fma. Fold fma to vfnmsac.vf, vfnmsub.vf, vfnmacc.vf, vfnmadd.vf.
Co-authored-by: Craig Topper <craig.topper@sifive.com>
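The arithmetic behind this fold can be checked with a small model. This is a sketch under stated assumptions rather than the patch's logic: the four functions follow the per-element definitions of the negated multiply-add instructions in the RISC-V vector specification, `fma_with_negated_scalar` is a hypothetical stand-in for the fli+fneg+fma sequence, and plain Python arithmetic rounds the product separately instead of performing a fused operation. Which operand patterns the patch actually matches is not visible in the truncated title.

```python
from typing import List


def vfnmsac(f: float, vs2: List[float], vd: List[float]) -> List[float]:
    # vd[i] = -(f * vs2[i]) + vd[i]
    return [-(f * x) + d for x, d in zip(vs2, vd)]


def vfnmacc(f: float, vs2: List[float], vd: List[float]) -> List[float]:
    # vd[i] = -(f * vs2[i]) - vd[i]
    return [-(f * x) - d for x, d in zip(vs2, vd)]


def vfnmsub(f: float, vs2: List[float], vd: List[float]) -> List[float]:
    # vd[i] = -(f * vd[i]) + vs2[i]
    return [-(f * d) + x for x, d in zip(vs2, vd)]


def vfnmadd(f: float, vs2: List[float], vd: List[float]) -> List[float]:
    # vd[i] = -(f * vd[i]) - vs2[i]
    return [-(f * d) - x for x, d in zip(vs2, vd)]


def fma_with_negated_scalar(c: float, xs: List[float], zs: List[float]) -> List[float]:
    """Model of the unfolded form: materialise c, negate it, then multiply-add."""
    neg_c = -c  # the fneg that the fold removes
    return [neg_c * x + z for x, z in zip(xs, zs)]
```

Because negation is exact in IEEE arithmetic, `vfnmsac(c, xs, zs)` gives the same result as multiplying by the negated scalar and adding, which is why the explicit negation can be dropped when the positive constant is available.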
CAPIIR includes this in some of its source files, so we need to ensure the header is around.
ronlieb
approved these changes
Jan 8, 2026
No description provided.