-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve stack usage to increase recursive initialization depth #88546
Conversation
@llvm/pr-subscribers-clang Author: Aaron Ballman (AaronBallman) ChangesWe were crashing due to stack exhaustion on rather reasonable C++ template code. After some investigation, I found that we have a stack-allocated object that was huge: With these change, With the original code, testing RelWithDebInfo on Windows 10, all the tests in SemaCXX took about 6s 800ms. The max template depth I could reach on my machine using the code in the issue was 708. After that, I would get With these changes on the same machine, all the tests in SemaCXX took about 6s 500ms. The max template depth I could reach was 1492. After that, I would get Fixes #88330 Patch is 30.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88546.diff 11 Files Affected:
diff --git a/clang/include/clang/Sema/Initialization.h b/clang/include/clang/Sema/Initialization.h
index 2072cd8d1c3ef8..f41ca97f030066 100644
--- a/clang/include/clang/Sema/Initialization.h
+++ b/clang/include/clang/Sema/Initialization.h
@@ -1134,7 +1134,7 @@ class InitializationSequence {
OverloadingResult FailedOverloadResult;
/// The candidate set created when initialization failed.
- OverloadCandidateSet FailedCandidateSet;
+ OverloadCandidateSet *FailedCandidateSet;
/// The incomplete type that caused a failure.
QualType FailedIncompleteType;
@@ -1403,7 +1403,9 @@ class InitializationSequence {
/// Retrieve a reference to the candidate set when overload
/// resolution fails.
OverloadCandidateSet &getFailedCandidateSet() {
- return FailedCandidateSet;
+ assert(FailedCandidateSet &&
+ "this should have been allocated in the constructor!");
+ return *FailedCandidateSet;
}
/// Get the overloading result, for when the initialization
diff --git a/clang/include/clang/Sema/Overload.h b/clang/include/clang/Sema/Overload.h
index 76311b00d2fc58..40800c7e1af381 100644
--- a/clang/include/clang/Sema/Overload.h
+++ b/clang/include/clang/Sema/Overload.h
@@ -874,7 +874,8 @@ class Sema;
ConversionFixItGenerator Fix;
/// Viable - True to indicate that this overload candidate is viable.
- bool Viable : 1;
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned Viable : 1;
/// Whether this candidate is the best viable function, or tied for being
/// the best viable function.
@@ -883,12 +884,14 @@ class Sema;
/// was part of the ambiguity kernel: the minimal non-empty set of viable
/// candidates such that all elements of the ambiguity kernel are better
/// than all viable candidates not in the ambiguity kernel.
- bool Best : 1;
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned Best : 1;
/// IsSurrogate - True to indicate that this candidate is a
/// surrogate for a conversion to a function pointer or reference
/// (C++ [over.call.object]).
- bool IsSurrogate : 1;
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned IsSurrogate : 1;
/// IgnoreObjectArgument - True to indicate that the first
/// argument's conversion, which for this function represents the
@@ -897,18 +900,20 @@ class Sema;
/// implicit object argument is just a placeholder) or a
/// non-static member function when the call doesn't have an
/// object argument.
- bool IgnoreObjectArgument : 1;
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned IgnoreObjectArgument : 1;
/// True if the candidate was found using ADL.
- CallExpr::ADLCallKind IsADLCandidate : 1;
+ LLVM_PREFERRED_TYPE(CallExpr::ADLCallKind)
+ unsigned IsADLCandidate : 1;
/// Whether this is a rewritten candidate, and if so, of what kind?
LLVM_PREFERRED_TYPE(OverloadCandidateRewriteKind)
unsigned RewriteKind : 2;
/// FailureKind - The reason why this candidate is not viable.
- /// Actually an OverloadFailureKind.
- unsigned char FailureKind;
+ LLVM_PREFERRED_TYPE(OverloadFailureKind)
+ unsigned FailureKind : 5;
/// The number of call arguments that were explicitly provided,
/// to be used while performing partial ordering of function templates.
@@ -972,7 +977,9 @@ class Sema;
private:
friend class OverloadCandidateSet;
OverloadCandidate()
- : IsSurrogate(false), IsADLCandidate(CallExpr::NotADL), RewriteKind(CRK_None) {}
+ : IsSurrogate(false),
+ IsADLCandidate(static_cast<unsigned>(CallExpr::NotADL)),
+ RewriteKind(CRK_None) {}
};
/// OverloadCandidateSet - A set of overload candidates, used in C++
@@ -1070,57 +1077,25 @@ class Sema;
};
private:
- SmallVector<OverloadCandidate, 16> Candidates;
- llvm::SmallPtrSet<uintptr_t, 16> Functions;
-
- // Allocator for ConversionSequenceLists. We store the first few of these
- // inline to avoid allocation for small sets.
- llvm::BumpPtrAllocator SlabAllocator;
+ ASTContext &Ctx;
+ SmallVector<OverloadCandidate, 4> Candidates;
+ llvm::SmallPtrSet<uintptr_t, 4> Functions;
SourceLocation Loc;
CandidateSetKind Kind;
OperatorRewriteInfo RewriteInfo;
- constexpr static unsigned NumInlineBytes =
- 24 * sizeof(ImplicitConversionSequence);
- unsigned NumInlineBytesUsed = 0;
- alignas(void *) char InlineSpace[NumInlineBytes];
-
// Address space of the object being constructed.
LangAS DestAS = LangAS::Default;
- /// If we have space, allocates from inline storage. Otherwise, allocates
- /// from the slab allocator.
- /// FIXME: It would probably be nice to have a SmallBumpPtrAllocator
- /// instead.
- /// FIXME: Now that this only allocates ImplicitConversionSequences, do we
- /// want to un-generalize this?
- template <typename T>
- T *slabAllocate(unsigned N) {
- // It's simpler if this doesn't need to consider alignment.
- static_assert(alignof(T) == alignof(void *),
- "Only works for pointer-aligned types.");
- static_assert(std::is_trivial<T>::value ||
- std::is_same<ImplicitConversionSequence, T>::value,
- "Add destruction logic to OverloadCandidateSet::clear().");
-
- unsigned NBytes = sizeof(T) * N;
- if (NBytes > NumInlineBytes - NumInlineBytesUsed)
- return SlabAllocator.Allocate<T>(N);
- char *FreeSpaceStart = InlineSpace + NumInlineBytesUsed;
- assert(uintptr_t(FreeSpaceStart) % alignof(void *) == 0 &&
- "Misaligned storage!");
-
- NumInlineBytesUsed += NBytes;
- return reinterpret_cast<T *>(FreeSpaceStart);
- }
void destroyCandidates();
public:
- OverloadCandidateSet(SourceLocation Loc, CandidateSetKind CSK,
+ OverloadCandidateSet(ASTContext &Ctx, SourceLocation Loc,
+ CandidateSetKind CSK,
OperatorRewriteInfo RewriteInfo = {})
- : Loc(Loc), Kind(CSK), RewriteInfo(RewriteInfo) {}
+ : Ctx(Ctx), Loc(Loc), Kind(CSK), RewriteInfo(RewriteInfo) {}
OverloadCandidateSet(const OverloadCandidateSet &) = delete;
OverloadCandidateSet &operator=(const OverloadCandidateSet &) = delete;
~OverloadCandidateSet() { destroyCandidates(); }
@@ -1163,7 +1138,7 @@ class Sema;
ConversionSequenceList
allocateConversionSequences(unsigned NumConversions) {
ImplicitConversionSequence *Conversions =
- slabAllocate<ImplicitConversionSequence>(NumConversions);
+ Ctx.Allocate<ImplicitConversionSequence>(NumConversions);
// Construct the new objects.
for (unsigned I = 0; I != NumConversions; ++I)
diff --git a/clang/lib/Sema/SemaCodeComplete.cpp b/clang/lib/Sema/SemaCodeComplete.cpp
index c335017f243eb2..c561fdc93ba00c 100644
--- a/clang/lib/Sema/SemaCodeComplete.cpp
+++ b/clang/lib/Sema/SemaCodeComplete.cpp
@@ -6210,7 +6210,8 @@ QualType Sema::ProduceCallSignatureHelp(Expr *Fn, ArrayRef<Expr *> Args,
Expr *NakedFn = Fn->IgnoreParenCasts();
// Build an overload candidate set based on the functions we find.
SourceLocation Loc = Fn->getExprLoc();
- OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet CandidateSet(Context, Loc,
+ OverloadCandidateSet::CSK_Normal);
if (auto ULE = dyn_cast<UnresolvedLookupExpr>(NakedFn)) {
AddOverloadedCallCandidates(ULE, ArgsWithoutDependentTypes, CandidateSet,
@@ -6411,7 +6412,8 @@ QualType Sema::ProduceConstructorSignatureHelp(QualType Type,
// FIXME: Provide support for variadic template constructors.
if (CRD) {
- OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet CandidateSet(Context, Loc,
+ OverloadCandidateSet::CSK_Normal);
for (NamedDecl *C : LookupConstructors(CRD)) {
if (auto *FD = dyn_cast<FunctionDecl>(C)) {
// FIXME: we can't yet provide correct signature help for initializer
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 720e56692359b3..de22408cd16a03 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -19073,7 +19073,8 @@ static void ComputeSelectedDestructor(Sema &S, CXXRecordDecl *Record) {
}
SourceLocation Loc = Record->getLocation();
- OverloadCandidateSet OCS(Loc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet OCS(S.getASTContext(), Loc,
+ OverloadCandidateSet::CSK_Normal);
for (auto *Decl : Record->decls()) {
if (auto *DD = dyn_cast<CXXDestructorDecl>(Decl)) {
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 51c14443d2d8f1..df6381e2cdf3bc 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -8198,7 +8198,8 @@ class DefaultedComparisonAnalyzer
// we've already found there is no viable 'operator<=>' candidate (and are
// considering synthesizing a '<=>' from '==' and '<').
OverloadCandidateSet CandidateSet(
- FD->getLocation(), OverloadCandidateSet::CSK_Operator,
+ S.getASTContext(), FD->getLocation(),
+ OverloadCandidateSet::CSK_Operator,
OverloadCandidateSet::OperatorRewriteInfo(
OO, FD->getLocation(),
/*AllowRewrittenCandidates=*/!SpaceshipCandidates));
@@ -17409,7 +17410,7 @@ bool Sema::EvaluateStaticAssertMessageAsString(Expr *Message,
LookupResult MemberLookup(*this, DN, Loc, Sema::LookupMemberName);
LookupQualifiedName(MemberLookup, RD);
Empty = MemberLookup.empty();
- OverloadCandidateSet Candidates(MemberLookup.getNameLoc(),
+ OverloadCandidateSet Candidates(Context, MemberLookup.getNameLoc(),
OverloadCandidateSet::CSK_Normal);
if (MemberLookup.empty())
return std::nullopt;
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index b294d2bd9f53f2..c14d484378a866 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -2500,7 +2500,7 @@ bool Sema::DiagnoseEmptyLookup(Scope *S, CXXScopeSpec &SS, LookupResult &R,
// If there's a best viable function among the results, only mention
// that one in the notes.
- OverloadCandidateSet Candidates(R.getNameLoc(),
+ OverloadCandidateSet Candidates(Context, R.getNameLoc(),
OverloadCandidateSet::CSK_Normal);
AddOverloadedCallCandidates(R, ExplicitTemplateArgs, Args, Candidates);
OverloadCandidateSet::iterator Best;
@@ -2548,7 +2548,7 @@ bool Sema::DiagnoseEmptyLookup(Scope *S, CXXScopeSpec &SS, LookupResult &R,
NamedDecl *ND = Corrected.getFoundDecl();
if (ND) {
if (Corrected.isOverloaded()) {
- OverloadCandidateSet OCS(R.getNameLoc(),
+ OverloadCandidateSet OCS(Context, R.getNameLoc(),
OverloadCandidateSet::CSK_Normal);
OverloadCandidateSet::iterator Best;
for (NamedDecl *CD : Corrected) {
@@ -6501,7 +6501,8 @@ static TypoCorrection TryTypoCorrectionForCall(Sema &S, Expr *Fn,
Sema::CTK_ErrorRecovery)) {
if (NamedDecl *ND = Corrected.getFoundDecl()) {
if (Corrected.isOverloaded()) {
- OverloadCandidateSet OCS(NameLoc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet OCS(S.getASTContext(), NameLoc,
+ OverloadCandidateSet::CSK_Normal);
OverloadCandidateSet::iterator Best;
for (NamedDecl *CD : Corrected) {
if (FunctionDecl *FD = dyn_cast<FunctionDecl>(CD))
diff --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index ce9d5c26e21858..0b7bda6d5f2f25 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -2586,7 +2586,7 @@ static bool resolveAllocationOverload(
Sema &S, LookupResult &R, SourceRange Range, SmallVectorImpl<Expr *> &Args,
bool &PassAlignment, FunctionDecl *&Operator,
OverloadCandidateSet *AlignedCandidates, Expr *AlignArg, bool Diagnose) {
- OverloadCandidateSet Candidates(R.getNameLoc(),
+ OverloadCandidateSet Candidates(S.getASTContext(), R.getNameLoc(),
OverloadCandidateSet::CSK_Normal);
for (LookupResult::iterator Alloc = R.begin(), AllocEnd = R.end();
Alloc != AllocEnd; ++Alloc) {
@@ -3919,7 +3919,7 @@ static bool resolveBuiltinNewDeleteOverload(Sema &S, CallExpr *TheCall,
R.suppressDiagnostics();
SmallVector<Expr *, 8> Args(TheCall->arguments());
- OverloadCandidateSet Candidates(R.getNameLoc(),
+ OverloadCandidateSet Candidates(S.getASTContext(), R.getNameLoc(),
OverloadCandidateSet::CSK_Normal);
for (LookupResult::iterator FnOvl = R.begin(), FnOvlEnd = R.end();
FnOvl != FnOvlEnd; ++FnOvl) {
@@ -6488,7 +6488,7 @@ static bool TryClassUnification(Sema &Self, Expr *From, Expr *To,
static bool FindConditionalOverload(Sema &Self, ExprResult &LHS, ExprResult &RHS,
SourceLocation QuestionLoc) {
Expr *Args[2] = { LHS.get(), RHS.get() };
- OverloadCandidateSet CandidateSet(QuestionLoc,
+ OverloadCandidateSet CandidateSet(Self.getASTContext(), QuestionLoc,
OverloadCandidateSet::CSK_Operator);
Self.AddBuiltinOperatorCandidates(OO_Conditional, QuestionLoc, Args,
CandidateSet);
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 6b8ce0f633a3ea..883bdece74daf1 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -6113,8 +6113,10 @@ static bool TryOCLZeroOpaqueTypeInitialization(Sema &S,
InitializationSequence::InitializationSequence(
Sema &S, const InitializedEntity &Entity, const InitializationKind &Kind,
MultiExprArg Args, bool TopLevelOfInitList, bool TreatUnavailableAsInvalid)
- : FailedOverloadResult(OR_Success),
- FailedCandidateSet(Kind.getLocation(), OverloadCandidateSet::CSK_Normal) {
+ : FailedOverloadResult(OR_Success), FailedCandidateSet(nullptr) {
+ FailedCandidateSet = S.getASTContext().Allocate<OverloadCandidateSet>();
+ FailedCandidateSet = new (FailedCandidateSet) OverloadCandidateSet(
+ S.getASTContext(), Kind.getLocation(), OverloadCandidateSet::CSK_Normal);
InitializeFrom(S, Entity, Kind, Args, TopLevelOfInitList,
TreatUnavailableAsInvalid);
}
@@ -6808,7 +6810,8 @@ static ExprResult CopyObject(Sema &S,
// Perform overload resolution using the class's constructors. Per
// C++11 [dcl.init]p16, second bullet for class types, this initialization
// is direct-initialization.
- OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet CandidateSet(S.getASTContext(), Loc,
+ OverloadCandidateSet::CSK_Normal);
DeclContext::lookup_result Ctors = S.LookupConstructors(Class);
OverloadCandidateSet::iterator Best;
@@ -6949,7 +6952,8 @@ static void CheckCXX98CompatAccessibleCopy(Sema &S,
return;
// Find constructors which would have been considered.
- OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet CandidateSet(S.getASTContext(), Loc,
+ OverloadCandidateSet::CSK_Normal);
DeclContext::lookup_result Ctors =
S.LookupConstructors(cast<CXXRecordDecl>(Record->getDecl()));
@@ -9735,7 +9739,7 @@ bool InitializationSequence::Diagnose(Sema &S,
switch (FailedOverloadResult) {
case OR_Ambiguous:
- FailedCandidateSet.NoteCandidates(
+ FailedCandidateSet->NoteCandidates(
PartialDiagnosticAt(
Kind.getLocation(),
Failure == FK_UserConversionOverloadFailed
@@ -9749,7 +9753,8 @@ bool InitializationSequence::Diagnose(Sema &S,
break;
case OR_No_Viable_Function: {
- auto Cands = FailedCandidateSet.CompleteCandidates(S, OCD_AllCandidates, Args);
+ auto Cands =
+ FailedCandidateSet->CompleteCandidates(S, OCD_AllCandidates, Args);
if (!S.RequireCompleteType(Kind.getLocation(),
DestType.getNonReferenceType(),
diag::err_typecheck_nonviable_condition_incomplete,
@@ -9759,7 +9764,7 @@ bool InitializationSequence::Diagnose(Sema &S,
<< OnlyArg->getType() << Args[0]->getSourceRange()
<< DestType.getNonReferenceType();
- FailedCandidateSet.NoteCandidates(S, Args, Cands);
+ FailedCandidateSet->NoteCandidates(S, Args, Cands);
break;
}
case OR_Deleted: {
@@ -9768,7 +9773,7 @@ bool InitializationSequence::Diagnose(Sema &S,
<< Args[0]->getSourceRange();
OverloadCandidateSet::iterator Best;
OverloadingResult Ovl
- = FailedCandidateSet.BestViableFunction(S, Kind.getLocation(), Best);
+ = FailedCandidateSet->BestViableFunction(S, Kind.getLocation(), Best);
if (Ovl == OR_Deleted) {
S.NoteDeletedFunction(Best->Function);
} else {
@@ -9946,7 +9951,7 @@ bool InitializationSequence::Diagnose(Sema &S,
// bad.
switch (FailedOverloadResult) {
case OR_Ambiguous:
- FailedCandidateSet.NoteCandidates(
+ FailedCandidateSet->NoteCandidates(
PartialDiagnosticAt(Kind.getLocation(),
S.PDiag(diag::err_ovl_ambiguous_init)
<< DestType << ArgsRange),
@@ -10000,7 +10005,7 @@ bool InitializationSequence::Diagnose(Sema &S,
break;
}
- FailedCandidateSet.NoteCandidates(
+ FailedCandidateSet->NoteCandidates(
PartialDiagnosticAt(
Kind.getLocation(),
S.PDiag(diag::err_ovl_no_viable_function_in_init)
@@ -10011,7 +10016,7 @@ bool InitializationSequence::Diagnose(Sema &S,
case OR_Deleted: {
OverloadCandidateSet::iterator Best;
OverloadingResult Ovl
- = FailedCandidateSet.BestViableFunction(S, Kind.getLocation(), Best);
+ = FailedCandidateSet->BestViableFunction(S, Kind.getLocation(), Best);
if (Ovl != OR_Deleted) {
S.Diag(Kind.getLocation(), diag::err_ovl_deleted_init)
<< DestType << ArgsRange;
@@ -10088,7 +10093,7 @@ bool InitializationSequence::Diagnose(Sema &S,
<< Args[0]->getSourceRange();
OverloadCandidateSet::iterator Best;
OverloadingResult Ovl
- = FailedCandidateSet.BestViableFunction(S, Kind.getLocation(), Best);
+ = FailedCandidateSet->BestViableFunction(S, Kind.getLocation(), Best);
(void)Ovl;
assert(Ovl == OR_Success && "Inconsistent overload resolution");
CXXConstructorDecl *CtorDecl = cast<CXXConstructorDecl>(Best->Function);
@@ -10818,7 +10823,7 @@ QualType Sema::DeduceTemplateSpecializationFromInitializer(
//
// Since we know we're initializing a class type of a type unrelated to that
// of the initializer, this reduces to something fairly reasonable.
- OverloadCandidateSet Candidates(Kind.getLocation(),
+ OverloadCandidateSet Candidates(Context, Kind.getLocation(),
OverloadCandidateSet::CSK_Normal);
OverloadCandidateSet::iterator Best;
diff --git a/clang/lib/Sema/SemaLookup.cpp b/clang/lib/Sema/SemaLookup.cpp
index d65f52b8efe81f..eedba5c1394770 100644
--- a/clang/lib/Sema/SemaLookup.cpp
+++ b/clang/lib/Sema/SemaLookup.cpp
@@ -3481,7 +3481,7 @@ Sema::LookupSpecialMember(CXXRecordDecl *RD, CXXSpecialMemberKind SM,
// Now we perform lookup on the name we computed earlier and do overload
// resolution. Lookup is only performed directly into the class since there
// will always be a (possibly implicit) declaration to shadow any others.
- OverloadCandidateSet OCS(LookupLoc, OverloadCandidateSet::CSK_Normal);
+ OverloadCandidateSet OCS(Context, LookupLoc, OverloadCandidateSet::CSK_Normal);
DeclContext::lookup_result R = RD->lookup(Name);
if (R.empty()) {
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index e1155dc2d5d285..e94db8aee144b3 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -1057,7 +1057,8 @@ bool OverloadCandidateSet::OperatorRewriteInfo::shouldAddReversed(
void Overlo...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
clang/lib/Sema/SemaInit.cpp
Outdated
FailedCandidateSet(Kind.getLocation(), OverloadCandidateSet::CSK_Normal) { | ||
: FailedOverloadResult(OR_Success), FailedCandidateSet(nullptr) { | ||
FailedCandidateSet = S.getASTContext().Allocate<OverloadCandidateSet>(); | ||
FailedCandidateSet = new (FailedCandidateSet) OverloadCandidateSet( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a future optimization, I wonder if we could defer this allocation until we already know that we've failed? Perhaps in SetOverloadFailure
? Not sure I can grok the analysis at the moment to know how possible this is though.
Co-authored-by: Erich Keane <ekeane@nvidia.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
…/llvm-project into aballman-stack-size
The precommit CI failures are real; I'm still investigating what's going sideways. |
It turns out this is important for some strange reason. ;-) Many thanks to @erichkeane and @cor3ntin for helping me solve this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a roller coaster!
LGTM
These don't need to be long-lived and there's likely to be a lot of overload and initialization code in a TU, so this should hopefully reduce memory pressure from the heap allocations. It also simplifies the patch by not needing to pass around an ASTContext.
A new memory leak https://lab.llvm.org/buildbot/#/builders/5/builds/42694 |
This change seems to cause significant compile-time regressions for C++ code (https://llvm-compile-time-tracker.com/compare.php?from=184ba038ac1d444980b3e554b0057f3f30c516ab&to=4082a7554521572a65a5a0008c4661a534df659d&stat=instructions%3Au). Probably most damning are the times for the clang build itself (https://llvm-compile-time-tracker.com/compare_clang.php?from=184ba038ac1d444980b3e554b0057f3f30c516ab&to=4082a7554521572a65a5a0008c4661a534df659d&stat=instructions%3Au&sortBy=interestingness) where files like X86ISelDAGToDAG.cpp regress by >3% even in an optimized build. |
for (unsigned I = 0; I != NumConversions; ++I) | ||
new (&Conversions[I]) ImplicitConversionSequence(); | ||
|
||
new ImplicitConversionSequence[NumConversions]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems unusual to keep ownership in ArrayRef.
Could we make ConversionSequenceList just a std::vector if it owns the memory?
@AaronBallman ping |
I've not been able to get back on to this, unfortunately. I'm not certain when I'll have time, either. |
It might be worth reopening the PR and asking for help? otherwise we'll forget And some things that should not have been forgotten were lost. |
We were crashing due to stack exhaustion on rather reasonable C++ template code. After some investigation, I found that we have a stack-allocated object that was huge:
InitializationSequence
was 7016 bytes. This caused an overflow with deep call stacks in initialization code.With these change,
InitializationSequence
is now 248 bytes.With the original code, testing RelWithDebInfo on Windows 10, all the tests in SemaCXX took about 6s 800ms. The max template depth I could reach on my machine using the code in the issue was 708. After that, I would get
-Wstack-exhausted
warnings until crashing at 976 instantiations.With these changes on the same machine, all the tests in SemaCXX took about 6s 500ms. The max template depth I could reach was 1492. After that, I would get
-Wstack-exhausted
warnings until crashing at 2898 instantiations.This improves the behavior of #88330 but there's still an outstanding question of why we run out of stack space and crash in some circumstances before we're able to issue a diagnostic about stack space exhaustion.