forked from llvm/llvm-project
[MLIR] Fix an assert that contains a mistake in conditional operator (#95668) #6
Closed
This was referenced Jun 18, 2024
shiltian pushed a commit that referenced this pull request Aug 26, 2024
…104523) Compilers and language runtimes often use helper functions that are fundamentally uninteresting when debugging anything but the compiler/runtime itself. This patch introduces a user-extensible mechanism that allows for these frames to be hidden from backtraces and automatically skipped over when navigating the stack with `up` and `down`. This does not affect the numbering of frames, so `f <N>` will still provide access to the hidden frames. The `bt` output will also print a hint that frames have been hidden. My primary motivation for this feature is to hide thunks in the Swift programming language, but I'm including an example recognizer for `std::function::operator()` that I wished for myself many times while debugging LLDB. rdar://126629381 Example output. (Yes, my proof-of-concept recognizer could hide even more frames if we had a method that returned the function name without the return type or I used something that isn't based off regex, but it's really only meant as an example). 
before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
(lldb)
```
after:
```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
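The navigation behaviour described in this commit message (hidden frames are skipped by `up` and `down`, but frame numbers do not change) can be modelled with a small sketch. This is not LLDB code: `kHidden`, `nextVisibleUp` and `nextVisibleDown` are hypothetical names, the hidden set (frames 3 to 5) is copied from the example output, and staying put at either end of the stack is an assumption made for the model.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: one flag per frame, true if a recognizer marked the
// frame as hidden. Frames 3, 4 and 5 are hidden, as in the example output.
constexpr std::array<bool, 9> kHidden = {false, false, false, true, true,
                                         true,  false, false, false};

// Index of the next visible frame above `current` (what `up` would select
// in this model). Assumption: stay on `current` when nothing is above it.
constexpr std::size_t nextVisibleUp(std::size_t current) {
  for (std::size_t i = current + 1; i < kHidden.size(); ++i)
    if (!kHidden[i])
      return i;
  return current;
}

// Index of the next visible frame below `current` (what `down` would select
// in this model). Assumption: stay on `current` when nothing is below it.
constexpr std::size_t nextVisibleDown(std::size_t current) {
  for (std::size_t i = current; i > 0; --i)
    if (!kHidden[i - 1])
      return i - 1;
  return current;
}

// Frame numbers are unchanged: moving up from frame 2 lands on frame 6.
static_assert(nextVisibleUp(2) == 6, "hidden frames 3-5 are skipped");
static_assert(nextVisibleDown(6) == 2, "hidden frames 3-5 are skipped");
```

Because the indices are never renumbered, a direct selection such as `f 4` can still address a hidden frame in this model, which matches the behaviour the message describes.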
[MLIR] Fix an assert that contains a mistake in conditional operator (llvm#95668)
This is described in (N2) https://pvs-studio.com/en/blog/posts/cpp/1126/
and was caught by the PVS-Studio analyzer.
Warning message -
V502 Perhaps the '?:' operator works in a different way than it was
expected. The '?:' operator has a lower priority than the '+' operator.
LoopEmitter.cpp 983
V502 Perhaps the '?:' operator works in a different way than it was
expected. The '?:' operator has a lower priority than the '+' operator.
LoopEmitter.cpp 1039
The assert should be
assert(bArgs.size() == reduc.size() + (needsUniv ? 1 : 0));
since + has higher precedence than ?:.
This can be further reduced to
assert(aArgs.size() == reduc.size() + needsUniv);
because needsUniv is a bool value, which is implicitly converted to 0 or 1.
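The precedence rule behind the V502 warning can be checked in isolation. The sketch below is not taken from LoopEmitter.cpp; the function names `ungrouped`, `grouped` and `viaBool` are hypothetical and only illustrate how an expression of the form `x + flag ? 1 : 0` is grouped by the compiler.

```cpp
#include <cstddef>

// Without parentheses, '+' binds tighter than '?:', so this is parsed as
// (base + extra) ? 1 : 0 and yields 1 whenever base + extra is non-zero.
// Some compilers emit a parentheses warning for this line, which is the
// same concern the analyzer raises.
constexpr std::size_t ungrouped(std::size_t base, bool extra) {
  return base + extra ? 1 : 0;
}

// With explicit parentheses the conditional only selects the increment.
constexpr std::size_t grouped(std::size_t base, bool extra) {
  return base + (extra ? 1 : 0);
}

// A bool operand is converted to 0 or 1, so the conditional can be dropped.
constexpr std::size_t viaBool(std::size_t base, bool extra) {
  return base + extra;
}

static_assert(ungrouped(3, true) == 1, "parsed as (3 + 1) ? 1 : 0");
static_assert(grouped(3, true) == 4, "3 + 1");
static_assert(viaBool(3, true) == grouped(3, true), "same value");
```

The parenthesized and the bool-based forms agree for every input, which is why the message can drop the conditional entirely.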
AMDGPU: Remove .v2bf16 buffer atomic fadd intrinsics (llvm#95783)
These are redundant with the unsuffixed versions, and have a name
collision with surprising behavior when the base intrinsic is used with
v2bf16.
The global and flat variants should be removed too, but those are complicated
due to using v2i16 in place of the natural v2bf16. Those cases can soon be
completely deleted in favor of atomicrmw.
The GlobalISel codegen change is broken and substitutes handling as bf16
for handling as f16, but it's a bug that this passed the IRTranslator in the first
place.
AMDGPU: Cleanup selection patterns for buffer loads (llvm#95378)
We should just support these for all register types.
[Offload] Change HSA header search order (llvm#95769)
Summary:
The HSA headers existed previously in `include/hsa.h` and were moved to
`include/hsa/hsa.h` in a later ROCm version. The include headers here
were originally designed to favor a newer one. However, this
unintentionally prevented the dynamic HSA's `hsa.h` from being used if
both were present. This patch changes the order so it will be found
first.
Related to llvm#95484.
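One way to read the header search order described above is as a `__has_include` chain that probes the flat `hsa.h` before the nested `hsa/hsa.h`. The sketch below is an assumption about the general shape, not the actual source of the patch; the macro `EXAMPLE_HSA_HEADER` and the constant `kExampleHsaHeaderChoice` are hypothetical and exist only to make the selected branch observable.

```cpp
// Probe the flat location first so that an `hsa.h` on the include path
// takes priority, then fall back to the newer nested ROCm layout.
#if defined(__has_include)
#  if __has_include("hsa.h")
#    include "hsa.h"
#    define EXAMPLE_HSA_HEADER 1  // hypothetical marker: flat header chosen
#  elif __has_include("hsa/hsa.h")
#    include "hsa/hsa.h"
#    define EXAMPLE_HSA_HEADER 2  // hypothetical marker: nested header chosen
#  endif
#endif

#ifndef EXAMPLE_HSA_HEADER
#  define EXAMPLE_HSA_HEADER 0    // hypothetical marker: no HSA header found
#endif

// Records which branch was taken: 0 = none, 1 = flat, 2 = nested.
constexpr int kExampleHsaHeaderChoice = EXAMPLE_HSA_HEADER;

static_assert(kExampleHsaHeaderChoice >= 0 && kExampleHsaHeaderChoice <= 2,
              "exactly one branch is selected");
```

Swapping the two `__has_include` tests would restore the earlier preference for the nested layout, which is the ordering the message says hid the dynamic HSA header when both were present.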
[Flang] Switch to common::visit more call sites (llvm#90018)
Switch more call sites to common::visit.
Test plan: ninja check-all
[flang] Fix comments and formatting. (NFC) (llvm#95786)
As mentioned here (llvm#95462 (comment)), the formatting of the comments
has been fixed. Also added comments before literal arguments.
[CI][format] Explicitly pass extensions to git-clang-format (llvm#95794)
This ensures that the CI script controls which file extensions are
considered instead of letting git-clang-format apply its own filtering
rules. In particular, this properly handles libc++ extension-less
headers which were passed to git-clang-format, but then dropped by that
tool as having an unrecognized extension.
[HWASan] [compiler-rt] support non-4k pages on Android (llvm#95069)
Reapply "[mlir][sparse] implement lowering rules for IterateOp." (llvm#95836)
Reland [mlir][Target] Improve ROCDL gpu serialization API (llvm#95813)
Reland: llvm#95456
This patch improves the ROCDL gpu serialization API by:
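- Introducing the enum `AMDGCNLibraries` for specifying the AMD GCN
  device code libraries to use during linking.
- Removing `getCommonBitcodeLibs` in favor of `AMDGCNLibraries`.
  Previously, `getCommonBitcodeLibs` would try to load all AMD GCN bitcode
  libraries; now it will only load the requested libraries.
- Exposing the `compileToBinary` method and making it virtual, allowing
  downstream users to re-use this method.
- Exposing `moduleToObjectImpl`, a method that provides a prototype flow
  for compiling to binary, allowing downstream users to re-use this
  method.
- Avoiding construction of the control variables if no device
  libraries are being used.
- Changing the style of the error messages to be composable, i.e. no full
  stops.
- Adding an error message for when the ROCm toolkit can't be found but it
  was required.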
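The idea of requesting device libraries through an enum, instead of loading all of them, can be sketched with a bit-flag type. This is not the MLIR API: `ExampleLibs` stands in for `AMDGCNLibraries`, and its members, `contains` and `needsControlVariables` are hypothetical names chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for the library-selection enum; each member is one
// bit so that several libraries can be requested at once.
enum class ExampleLibs : std::uint32_t {
  None = 0,
  Ockl = 1,
  Ocml = 2,
  OpenCL = 4,
  Hip = 8,
};

constexpr ExampleLibs operator|(ExampleLibs a, ExampleLibs b) {
  return static_cast<ExampleLibs>(static_cast<std::uint32_t>(a) |
                                  static_cast<std::uint32_t>(b));
}

// True if every bit of `lib` is present in `requested`.
constexpr bool contains(ExampleLibs requested, ExampleLibs lib) {
  return (static_cast<std::uint32_t>(requested) &
          static_cast<std::uint32_t>(lib)) == static_cast<std::uint32_t>(lib);
}

// Mirrors the "no device libraries, no control variables" rule from the list.
constexpr bool needsControlVariables(ExampleLibs requested) {
  return requested != ExampleLibs::None;
}

constexpr ExampleLibs kRequested = ExampleLibs::Ockl | ExampleLibs::Ocml;
static_assert(contains(kRequested, ExampleLibs::Ocml), "requested library");
static_assert(!contains(kRequested, ExampleLibs::Hip), "not requested");
static_assert(!needsControlVariables(ExampleLibs::None), "nothing to link");
```

A linker step built on such a type would iterate over the individual flags and load only those for which `contains` returns true.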
[libc] Only include getauxval on AARCH64 targets (llvm#95844)
Summary:
Not all platforms support this function or header, but it was being
included by every test. Move it inside of the `ifdef` for the only user,
which is aarch64.
[libc][stdlib] Only add internal malloc in full build mode. Use the system malloc in overlay mode. (llvm#95845)
This causes an issue in overlay mode:
llvm#95736 (comment)
[scudo] Update error handling for secondary cache entry count (llvm#95595)
Initially, the scudo allocator would return an error if the user
attempted to set the cache capacity
(i.e. the number of possible entries in the cache) above the maximum
cache capacity.
Now the allocator will resort to using the maximum cache capacity in
this event.
An error will still be returned if the user attempts to set the number
of entries to a negative value.
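The new policy (clamp requests above the maximum, still reject negative values) can be sketched as follows. This is not the scudo implementation: `kExampleMaxEntries`, `SetResult` and `setEntriesCount` are hypothetical names, and the limit of 32 is an arbitrary value chosen for the example.

```cpp
#include <cstdint>

// Hypothetical upper bound on the number of cache entries.
constexpr std::int32_t kExampleMaxEntries = 32;

// Outcome of a request: whether it was accepted and the resulting capacity.
struct SetResult {
  bool ok;
  std::uint32_t entries;
};

// Negative requests are rejected and leave `current` unchanged; requests
// above the maximum are accepted but clamped to the maximum.
constexpr SetResult setEntriesCount(std::int32_t requested,
                                    std::uint32_t current) {
  if (requested < 0)
    return {false, current};
  if (requested > kExampleMaxEntries)
    return {true, static_cast<std::uint32_t>(kExampleMaxEntries)};
  return {true, static_cast<std::uint32_t>(requested)};
}

static_assert(setEntriesCount(100, 8).ok, "oversized request is accepted");
static_assert(setEntriesCount(100, 8).entries == 32, "and clamped");
static_assert(!setEntriesCount(-1, 8).ok, "negative request is an error");
```

Under the earlier behaviour described in the message, the oversized request would have taken the error path as well.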
[mlgo] inline for size: add bypass mechanism for preserving performance (llvm#95616)
This allows the cold part of the code to be shrunk for size without sacrificing performance.
Revert "Reland [mlir][Target] Improve ROCDL gpu serialization API" (llvm#95847)
Reverts llvm#95813
Revert "[HWASan] [compiler-rt] support non-4k pages on Android" (llvm#95853)
Reverts llvm#95069
Broke windows bot
[mlgo] remove `inlining_default` - unused feature
The feature was only exposed for training and was immediately dropped on
the training side. It was bulk-copied into the test model generator, where
it had no effect (the generator always returns a constant).
In the AOT + test model case, since the test model returns a constant, all
input features are pruned by the AOT compiler, so its presence/absence
doesn't matter.
[libc][stdlib] Run freelist_heap_test only in full build mode. (llvm#95850)
[mlir][drr] Fix variadic destination emission (llvm#95855)
It's possible for handleResultPattern to emit helpers; these helpers
cannot be interleaved with pushing into the array. Emit into a separate
string to enable helpers to be emitted before the population of the vector.
Signed-off-by: Jacques Pienaar <jpienaar@google.com>
Fix comments in ValueObjectPrinter.h (NFC)
[BOLT] Drop high discrepancy profiles in matching (llvm#95156)
Summary: Functions with high discrepancy (measured by matched function
blocks) can be ignored with an added command line argument for better
performance.
Test Plan: Added stale-matching-min-matched-block.test
Co-authored-by: Amir Ayupov <aaupov@fb.com>
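The filtering rule in the summary can be pictured as a percentage threshold on matched blocks. The sketch below assumes that reading; it is not BOLT code, and `keepProfile` and its parameters are hypothetical names.

```cpp
#include <cstdint>

// Keep a function's profile only if at least `minMatchedPercent` percent of
// its blocks were matched. Cross-multiplication avoids integer division;
// a function with zero blocks trivially passes in this model (assumption).
constexpr bool keepProfile(std::uint64_t matchedBlocks,
                           std::uint64_t totalBlocks,
                           std::uint64_t minMatchedPercent) {
  return matchedBlocks * 100 >= minMatchedPercent * totalBlocks;
}

static_assert(!keepProfile(3, 10, 50), "30% matched is below a 50% threshold");
static_assert(keepProfile(8, 10, 50), "80% matched passes a 50% threshold");
```

With a threshold of 0 every profile is kept, so in this model the filter is opt-in.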
[lldb] Fix Python interpreter workaround (attempt #2)
On macOS, to make DYLD_INSERT_LIBRARIES and the Python shim work
together, we have a workaround that copies the "real" Python interpreter
into the build directory. This doesn't work when running in a virtual
environment, as the copied interpreter cannot find the packages
installed in the virtual environment relative to itself.
Address this issue by copying the Python interpreter into the virtual
environment's `bin` folder, rather than the build folder, when the test
suite detects that it's being run inside a virtual environment.
I'm not thrilled about this solution because it puts a file outside the
build directory. However, given virtual environments are considered
disposable, this seems reasonable.
[docs][Security Group] Update Apple representation. (llvm#95491)
Remove Kate; Apple is now represented by Oliver (and myself).
Reapply "[HWASan] [compiler-rt] support non-4k pages on Android" (llvm#95853)
Updated MapDynamicShadow callsite in asan_win.
[Clang][AMDGPU] Add a new builtin type for buffer rsrc