-
Notifications
You must be signed in to change notification settings - Fork 1
[Flang] Loop Interchange #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
PR llvm#153488 caused the msvc build (https://lab.llvm.org/buildbot/#/builders/166/builds/1397) to fail: ``` ..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): error C2668: 'Fortran::evaluate::rewrite::Identity::operator ()': ambiguous call to overloaded function ..\llvm-project\flang\include\flang/Evaluate/rewrite.h(43): note: could be 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::evaluate::rewrite::Identity::operator ()<Fortran::evaluate::SomeType,S>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &)' with [ S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>, U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128> ] ..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(174): note: or 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::semantics::ReassocRewriter::operator ()<Fortran::evaluate::SomeType,S,void>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &,Fortran::semantics::ReassocRewriter::NonIntegralTag)' with [ S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>, U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128> ] ..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: while trying to match the argument list '(Fortran::evaluate::Expr<Fortran::evaluate::SomeType>, const S)' with [ S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128> ] ..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: the template instantiation context (the oldest one first) is ..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(814): note: see reference to function template instantiation 'U Fortran::evaluate::rewrite::Mutator<Fortran::semantics::ReassocRewriter>::operator ()<const Fortran::evaluate::Expr<Fortran::evaluate::SomeType>&,Fortran::evaluate::Expr<Fortran::evaluate::SomeType>>(T)' being compiled with [ U=Fortran::evaluate::Expr<Fortran::evaluate::SomeType>, T=const Fortran::evaluate::Expr<Fortran::evaluate::SomeType> & ] ``` The reason is that there is an ambiguite between operator() of ReassocRewriter itself as operator() of the base class Identity through `using Id::operator();`. By the C++ specification, method declarations in ReassocRewriter hide methods with the same signature from a using declaration, but this does not apply to ``` evaluate::Expr<T> operator()(..., NonIntegralTag = {}) ``` which has a different signature due to an additional tag parameter. Since it has a default value, it is ambiguous with operator() without tag parameter. GCC and Clang both accept this, but in my understanding MSVC is correct here. Since the overloads of ReassocRewriter cover all cases, remopving the using reclarations to avoid the ambiguity.
…d the tile sizes through the parse tree when getting the information needed to create the loop nest ops.
…nested loop constructs.
…single nested loop construct, which is what we prefer.
| source .venv/bin/activate && cd repo | ||
| python -m pip install -r libcxx/utils/requirements.txt | ||
| baseline_commit=$(git merge-base ${{ steps.vars.outputs.pr_base }} ${{ steps.vars.outputs.pr_head }}) | ||
| ./libcxx/utils/test-at-commit --commit ${baseline_commit} -B build/baseline -- -sv -j1 --param optimization=speed ${{ steps.vars.outputs.benchmarks }} |
Check failure
Code scanning / CodeQL
Code injection Critical
${ steps.vars.outputs.benchmarks }
issue_comment
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
To fix the code injection risk, we must change how the $BENCHMARKS data (which is environment variable-bound from user input) is consumed in the workflow steps that run shell commands. The best practice is to avoid direct usage of ${{ steps.vars.outputs.benchmarks }} in the shell lines, instead setting an environment variable (as is already done with other values) and using native shell variable syntax with proper quoting: "$BENCHMARKS".
Concretely, in all run: commands where ${{ steps.vars.outputs.benchmarks }} appears, refactor the step to set benchmarks: ${{ steps.vars.outputs.benchmarks }} in the env: section, and then reference it inside the run: | block using "$benchmarks" (the value is inherited as a lower-case shell variable). Make sure to quote the variable correctly inside shell commands to prevent injection via whitespace or metacharacters.
Edit the block (line 71 and line 77) accordingly.
No additional method definitions or helper scripts are needed. There is no need to install any new dependencies.
-
Copy modified lines R67-R68 -
Copy modified line R73 -
Copy modified lines R77-R78 -
Copy modified line R81
| @@ -64,17 +64,21 @@ | ||
| path: repo # Avoid nuking the workspace, where we have the Python virtualenv | ||
|
|
||
| - name: Run baseline | ||
| env: | ||
| benchmarks: ${{ steps.vars.outputs.benchmarks }} | ||
| run: | | ||
| source .venv/bin/activate && cd repo | ||
| python -m pip install -r libcxx/utils/requirements.txt | ||
| baseline_commit=$(git merge-base ${{ steps.vars.outputs.pr_base }} ${{ steps.vars.outputs.pr_head }}) | ||
| ./libcxx/utils/test-at-commit --commit ${baseline_commit} -B build/baseline -- -sv -j1 --param optimization=speed ${{ steps.vars.outputs.benchmarks }} | ||
| ./libcxx/utils/test-at-commit --commit ${baseline_commit} -B build/baseline -- -sv -j1 --param optimization=speed "$benchmarks" | ||
| ./libcxx/utils/consolidate-benchmarks build/baseline | tee baseline.lnt | ||
|
|
||
| - name: Run candidate | ||
| env: | ||
| benchmarks: ${{ steps.vars.outputs.benchmarks }} | ||
| run: | | ||
| source .venv/bin/activate && cd repo | ||
| ./libcxx/utils/test-at-commit --commit ${{ steps.vars.outputs.pr_head }} -B build/candidate -- -sv -j1 --param optimization=speed ${{ steps.vars.outputs.benchmarks }} | ||
| ./libcxx/utils/test-at-commit --commit ${{ steps.vars.outputs.pr_head }} -B build/candidate -- -sv -j1 --param optimization=speed "$benchmarks" | ||
| ./libcxx/utils/consolidate-benchmarks build/candidate | tee candidate.lnt | ||
|
|
||
| - name: Compare baseline and candidate runs |
| - name: Run candidate | ||
| run: | | ||
| source .venv/bin/activate && cd repo | ||
| ./libcxx/utils/test-at-commit --commit ${{ steps.vars.outputs.pr_head }} -B build/candidate -- -sv -j1 --param optimization=speed ${{ steps.vars.outputs.benchmarks }} |
Check failure
Code scanning / CodeQL
Code injection Critical
${ steps.vars.outputs.benchmarks }
issue_comment
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
To fix the code injection vulnerability, we need to ensure the untrusted input (${{ steps.vars.outputs.benchmarks }}) is not interpolated directly into the shell in the run: block. Instead, assign it to an environment variable in the workflow step (env:), and then refer to it in the shell using normal shell variable syntax ($BENCHMARKS). The change should be made in the step starting at line 75 (Run candidate), specifically on line 77. For defense-in-depth, consider quoting appropriately around the variable if it is used as a command argument to prevent further injection.
This requires:
- Moving the
${{ steps.vars.outputs.benchmarks }}interpolation from command arguments into an environment variable (env: BENCHMARKS: ${{ steps.vars.outputs.benchmarks }}). - Changing the shell command to use
$BENCHMARKSas the argument (quoting it if possible). - No additional package imports or modifications necessary outside of these lines.
-
Copy modified lines R75-R76 -
Copy modified line R79
| @@ -72,9 +72,11 @@ | ||
| ./libcxx/utils/consolidate-benchmarks build/baseline | tee baseline.lnt | ||
|
|
||
| - name: Run candidate | ||
| env: | ||
| BENCHMARKS: ${{ steps.vars.outputs.benchmarks }} | ||
| run: | | ||
| source .venv/bin/activate && cd repo | ||
| ./libcxx/utils/test-at-commit --commit ${{ steps.vars.outputs.pr_head }} -B build/candidate -- -sv -j1 --param optimization=speed ${{ steps.vars.outputs.benchmarks }} | ||
| ./libcxx/utils/test-at-commit --commit ${{ steps.vars.outputs.pr_head }} -B build/candidate -- -sv -j1 --param optimization=speed "$BENCHMARKS" | ||
| ./libcxx/utils/consolidate-benchmarks build/candidate | tee candidate.lnt | ||
|
|
||
| - name: Compare baseline and candidate runs |
The test makes use of the preprocessor, which requires a .F90 suffix
A recent change adding a new sanitizer kind (via Sanitizers.def) was reverted in c74fa20 ("Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind" (llvm#162413)"). The reason was this ASan report, when running the test cases in clang/test/Preprocessor/print-header-json.c: ``` ==clang==483265==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7d82b97e8b58 at pc 0x562cd432231f bp 0x7fff3fad0850 sp 0x7fff3fad0848 READ of size 16 at 0x7d82b97e8b58 thread T0 #0 0x562cd432231e in __copy_non_overlapping_range<const unsigned long *, const unsigned long *> zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:2144:38 #1 0x562cd432231e in void std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__init_with_size[abi:nn220000]<unsigned long const*, unsigned long const*>(unsigned long const*, unsigned long const*, unsigned long) zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:2685:18 #2 0x562cd41e2797 in __init<const unsigned long *, 0> zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:2673:3 #3 0x562cd41e2797 in basic_string<const unsigned long *, 0> zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:1174:5 #4 0x562cd41e2797 in clang::ASTReader::ReadString(llvm::SmallVectorImpl<unsigned long> const&, unsigned int&) clang/lib/Serialization/ASTReader.cpp:10171:15 #5 0x562cd41fd89a in clang::ASTReader::ParseLanguageOptions(llvm::SmallVector<unsigned long, 64u> const&, llvm::StringRef, bool, clang::ASTReaderListener&, bool) clang/lib/Serialization/ASTReader.cpp:6475:28 llvm#6 0x562cd41eea53 in clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) clang/lib/Serialization/ASTReader.cpp:3069:11 llvm#7 0x562cd4204ab8 in clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) clang/lib/Serialization/ASTReader.cpp:3249:15 llvm#8 0x562cd42097d2 in clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) clang/lib/Serialization/ASTReader.cpp:5182:15 llvm#9 0x562cd421ec77 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, clang::serialization::ModuleFile**) clang/lib/Serialization/ASTReader.cpp:4828:11 llvm#10 0x562cd3d07b74 in clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) clang/lib/Frontend/CompilerInstance.cpp:1805:27 llvm#11 0x562cd3d0b2ef in clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<clang::IdentifierLoc>, clang::Module::NameVisibilityKind, bool) clang/lib/Frontend/CompilerInstance.cpp:1956:31 llvm#12 0x562cdb04eb1c in clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2423:49 llvm#13 0x562cdb042222 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2101:17 llvm#14 0x562cdb043366 in clang::Preprocessor::HandleDirective(clang::Token&) clang/lib/Lex/PPDirectives.cpp:1338:14 llvm#15 0x562cdafa84bc in clang::Lexer::LexTokenInternal(clang::Token&, bool) clang/lib/Lex/Lexer.cpp:4512:7 llvm#16 0x562cdaf9f20b in clang::Lexer::Lex(clang::Token&) clang/lib/Lex/Lexer.cpp:3729:24 llvm#17 0x562cdb0d4ffa in clang::Preprocessor::Lex(clang::Token&) clang/lib/Lex/Preprocessor.cpp:896:11 llvm#18 0x562cd77da950 in clang::ParseAST(clang::Sema&, bool, bool) clang/lib/Parse/ParseAST.cpp:163:7 [...] 0x7d82b97e8b58 is located 0 bytes after 3288-byte region [0x7d82b97e7e80,0x7d82b97e8b58) allocated by thread T0 here: #0 0x562cca76f604 in malloc zorg-test/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:67:3 #1 0x562cd1cce452 in safe_malloc llvm/include/llvm/Support/MemAlloc.h:26:18 #2 0x562cd1cce452 in llvm::SmallVectorBase<unsigned int>::grow_pod(void*, unsigned long, unsigned long) llvm/lib/Support/SmallVector.cpp:151:15 #3 0x562cdbe1768b in grow_pod llvm/include/llvm/ADT/SmallVector.h:139:11 #4 0x562cdbe1768b in grow llvm/include/llvm/ADT/SmallVector.h:525:41 #5 0x562cdbe1768b in reserve llvm/include/llvm/ADT/SmallVector.h:665:13 llvm#6 0x562cdbe1768b in llvm::BitstreamCursor::readRecord(unsigned int, llvm::SmallVectorImpl<unsigned long>&, llvm::StringRef*) llvm/lib/Bitstream/Reader/BitstreamReader.cpp:230:10 llvm#7 0x562cd41ee8ab in clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) clang/lib/Serialization/ASTReader.cpp:3060:49 llvm#8 0x562cd4204ab8 in clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) clang/lib/Serialization/ASTReader.cpp:3249:15 llvm#9 0x562cd42097d2 in clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) clang/lib/Serialization/ASTReader.cpp:5182:15 llvm#10 0x562cd421ec77 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, clang::serialization::ModuleFile**) clang/lib/Serialization/ASTReader.cpp:4828:11 llvm#11 0x562cd3d07b74 in clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) clang/lib/Frontend/CompilerInstance.cpp:1805:27 llvm#12 0x562cd3d0b2ef in clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<clang::IdentifierLoc>, clang::Module::NameVisibilityKind, bool) clang/lib/Frontend/CompilerInstance.cpp:1956:31 llvm#13 0x562cdb04eb1c in clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2423:49 llvm#14 0x562cdb042222 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2101:17 llvm#15 0x562cdb043366 in clang::Preprocessor::HandleDirective(clang::Token&) clang/lib/Lex/PPDirectives.cpp:1338:14 llvm#16 0x562cdafa84bc in clang::Lexer::LexTokenInternal(clang::Token&, bool) clang/lib/Lex/Lexer.cpp:4512:7 llvm#17 0x562cdaf9f20b in clang::Lexer::Lex(clang::Token&) clang/lib/Lex/Lexer.cpp:3729:24 llvm#18 0x562cdb0d4ffa in clang::Preprocessor::Lex(clang::Token&) clang/lib/Lex/Preprocessor.cpp:896:11 llvm#19 0x562cd77da950 in clang::ParseAST(clang::Sema&, bool, bool) clang/lib/Parse/ParseAST.cpp:163:7 [...] SUMMARY: AddressSanitizer: heap-buffer-overflow clang/lib/Serialization/ASTReader.cpp:10171:15 in clang::ASTReader::ReadString(llvm::SmallVectorImpl<unsigned long> const&, unsigned int&) ``` The reason is this particular RUN line: ``` // RUN: env CC_PRINT_HEADERS_FORMAT=json CC_PRINT_HEADERS_FILTERING=direct-per-file CC_PRINT_HEADERS_FILE=%t.txt %clang -fsyntax-only -I %S/Inputs/print-header-json -isystem %S/Inputs/print-header-json/system -fmodules -fimplicit-module-maps -fmodules-cache-path=%t %s -o /dev/null ``` which was added in 8df194f ("[Clang] Support includes translated to module imports in -header-include-filtering=direct-per-file (llvm#156756)"). The problem is caused by an incremental build reusing stale cached module files (.pcm) that are no longer binary-compatible with the updated compiler. Adding a new sanitizer option altered the implicit binary layout of the serialized LangOptions data structure. The build + test system is oblivious to such changes. When the new compiler attempted to read the old module file (from the previous test invocation), it misinterpreted the data due to the layout mismatch, resulting in a heap-buffer-overflow. Unfortunately Clang's PCM format does not encode nor detect version mismatches here; a more graceful failure mode would be preferable. For now, fix the test to be more robust with incremental build + test.
**Mitigation for:** google/sanitizers#749 **Disclosure:** I'm not an ASan compiler expert yet (I'm trying to learn!), I primarily work in the runtime. Some of this PR was developed with the help of AI tools (primarily as a "fuzzy `grep` engine"), but I've manually refined and tested the output, and can speak for every line. In general, I used it only to orient myself and for "rubberducking". **Context:** The msvc ASan team (👋 ) has received an internal request to improve clang's exception handling under ASan for Windows. Namely, we're interested in **mitigating** this bug: google/sanitizers#749 To summarize, today, clang + ASan produces a false-positive error for this program: ```C++ #include <cstdio> #include <exception> int main() { try { throw std::exception("test"); }catch (const std::exception& ex){ puts(ex.what()); } return 0; } ``` The error reads as such: ``` C:\Users\dajusto\source\repros\upstream>type main.cpp #include <cstdio> #include <exception> int main() { try { throw std::exception("test"); }catch (const std::exception& ex){ puts(ex.what()); } return 0; } C:\Users\dajusto\source\repros\upstream>"C:\Users\dajusto\source\repos\llvm-project\build.runtimes\bin\clang.exe" -fsanitize=address -g -O0 main.cpp C:\Users\dajusto\source\repros\upstream>a.exe ================================================================= ==19112==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x7ff72c7c11d9 bp 0x0080000ff960 sp 0x0080000fcf50 T0) ==19112==The signal is caused by a READ memory access. ==19112==Hint: address points to the zero page. #0 0x7ff72c7c11d8 in main C:\Users\dajusto\source\repros\upstream\main.cpp:8 #1 0x7ff72c7d479f in _CallSettingFrame C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm:49 #2 0x7ff72c7c8944 in __FrameHandler3::CxxCallCatchBlock(struct _EXCEPTION_RECORD *) C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\frame.cpp:1567 #3 0x7ffb4a90e3e5 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18012e3e5) #4 0x7ff72c7c1128 in main C:\Users\dajusto\source\repros\upstream\main.cpp:6 #5 0x7ff72c7c33db in invoke_main C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78 llvm#6 0x7ff72c7c33db in __scrt_common_main_seh C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288 llvm#7 0x7ffb49b05c06 (C:\WINDOWS\System32\KERNEL32.DLL+0x180035c06) llvm#8 0x7ffb4a8455ef (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800655ef) ==19112==Register values: rax = 0 rbx = 80000ff8e0 rcx = 27d76d00000 rdx = 80000ff8e0 rdi = 80000fdd50 rsi = 80000ff6a0 rbp = 80000ff960 rsp = 80000fcf50 r8 = 100 r9 = 19930520 r10 = 8000503a90 r11 = 80000fd540 r12 = 80000fd020 r13 = 0 r14 = 80000fdeb8 r15 = 0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: access-violation C:\Users\dajusto\source\repros\upstream\main.cpp:8 in main ==19112==ABORTING ``` The root of the issue _appears to be_ that ASan's instrumentation is incompatible with Window's assumptions for instantiating `catch`-block's parameters (`ex` in the snippet above). The nitty gritty details are lost on me, but I understand that to make this work without loss of ASan coverage, a "serious" refactoring is needed. In the meantime, users risk false positive errors when pairing ASan + catch-block parameters on Windows. **To mitigate this** I think we should avoid instrumenting catch-block parameters on Windows. It appears to me this is as "simple" as marking catch block parameters as "uninteresting" in `AddressSanitizer::isInterestingAlloca`. My manual tests seem to confirm this. I believe this is strictly better than today's status quo, where the runtime generates false positives. Although we're now explicitly choosing to instrument less, the benefit is that now more programs can run with ASan without _funky_ macros that disable ASan on exception blocks. **This PR:** implements the mitigation above, and creates a simple new test for it. _Thanks!_ --------- Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>
No description provided.