-
Notifications
You must be signed in to change notification settings - Fork 13.3k
[analyzer] Backport performace regression fix #89725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-static-analyzer-1 Author: Balazs Benics (steakhal) ChangesFixes #89045 We agreed that this slowdown was so serious, that it's basically equivalent with a crash by blocking analysis. In this commit we fixed the running time regression, and we want it in clang-18 too. FYI We don't have a test for this as this only affects running times. Full diff: https://github.com/llvm/llvm-project/pull/89725.diff 1 Files Affected:
diff --git a/clang/lib/StaticAnalyzer/Checkers/Taint.cpp b/clang/lib/StaticAnalyzer/Checkers/Taint.cpp
index 4edb671753bf45..6362c82b009d72 100644
--- a/clang/lib/StaticAnalyzer/Checkers/Taint.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/Taint.cpp
@@ -216,21 +216,17 @@ std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State,
std::vector<SymbolRef> TaintedSymbols;
if (!Reg)
return TaintedSymbols;
- // Element region (array element) is tainted if either the base or the offset
- // are tainted.
+
+ // Element region (array element) is tainted if the offset is tainted.
if (const ElementRegion *ER = dyn_cast<ElementRegion>(Reg)) {
std::vector<SymbolRef> TaintedIndex =
getTaintedSymbolsImpl(State, ER->getIndex(), K, returnFirstOnly);
llvm::append_range(TaintedSymbols, TaintedIndex);
if (returnFirstOnly && !TaintedSymbols.empty())
return TaintedSymbols; // return early if needed
- std::vector<SymbolRef> TaintedSuperRegion =
- getTaintedSymbolsImpl(State, ER->getSuperRegion(), K, returnFirstOnly);
- llvm::append_range(TaintedSymbols, TaintedSuperRegion);
- if (returnFirstOnly && !TaintedSymbols.empty())
- return TaintedSymbols; // return early if needed
}
+ // Symbolic region is tainted if the corresponding symbol is tainted.
if (const SymbolicRegion *SR = dyn_cast<SymbolicRegion>(Reg)) {
std::vector<SymbolRef> TaintedRegions =
getTaintedSymbolsImpl(State, SR->getSymbol(), K, returnFirstOnly);
@@ -239,6 +235,8 @@ std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State,
return TaintedSymbols; // return early if needed
}
+ // Any subregion (including Element and Symbolic regions) is tainted if its
+ // super-region is tainted.
if (const SubRegion *ER = dyn_cast<SubRegion>(Reg)) {
std::vector<SymbolRef> TaintedSubRegions =
getTaintedSymbolsImpl(State, ER->getSuperRegion(), K, returnFirstOnly);
@@ -318,4 +316,4 @@ std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State,
}
}
return TaintedSymbols;
-}
\ No newline at end of file
+}
|
Previously the function ``` std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State, const MemRegion *Reg, TaintTagType K, bool returnFirstOnly) ``` (one of the 4 overloaded variants under this name) was handling element regions in a highly inefficient manner: it performed the "also examine the super-region" step twice. (Once in the branch for element regions, and once in the more general branch for all `SubRegion`s -- note that `ElementRegion` is a subclass of `SubRegion`.) As pointer arithmetic produces `ElementRegion`s, it's not too difficult to get a chain of N nested element regions where this inefficient recursion would produce 2^N calls. This commit is essentially NFC, apart from the performance improvements and the removal of (probably irrelevant) duplicate entries from the return value of `getTaintedSymbols()` calls. Fixes #89045 (cherry picked from commit ce763bf)
9097b19
to
3b4ba72
Compare
Hi @steakhal (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. |
I think this would work: "In previous dot releases, we had a critical slowdown on analyzing code hashing or doing many array accesses. This bug did not affect previous major releases. See the details at issue #89045." |
Fixes #89045
(cherry picked from commit ce763bf)
We agreed that this slowdown was so serious, that it's basically equivalent with a crash by blocking analysis.
See the comment here from the code-owner.
In this commit we fixed the running time regression, and we want it in clang-18 too.
FYI We don't have a test for this as this only affects running times.
We also don't have an entry for this in the ReleaseNotes, as this regression was introduced by clang-18.