Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve stack usage to increase recursive initialization depth #88546

Merged
merged 16 commits into from
Apr 16, 2024

Conversation

AaronBallman
Copy link
Collaborator

@AaronBallman AaronBallman commented Apr 12, 2024

We were crashing due to stack exhaustion on rather reasonable C++ template code. After some investigation, I found that we have a stack-allocated object that was huge: InitializationSequence was 7016 bytes. This caused an overflow with deep call stacks in initialization code.

With these change, InitializationSequence is now 248 bytes.

With the original code, testing RelWithDebInfo on Windows 10, all the tests in SemaCXX took about 6s 800ms. The max template depth I could reach on my machine using the code in the issue was 708. After that, I would get -Wstack-exhausted warnings until crashing at 976 instantiations.

With these changes on the same machine, all the tests in SemaCXX took about 6s 500ms. The max template depth I could reach was 1492. After that, I would get -Wstack-exhausted warnings until crashing at 2898 instantiations.

This improves the behavior of #88330 but there's still an outstanding question of why we run out of stack space and crash in some circumstances before we're able to issue a diagnostic about stack space exhaustion.

@AaronBallman AaronBallman added c++ clang:frontend Language frontend issues, e.g. anything involving "Sema" quality-of-implementation labels Apr 12, 2024
@AaronBallman AaronBallman requested a review from erichkeane April 12, 2024 17:58
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Apr 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 12, 2024

@llvm/pr-subscribers-clang

Author: Aaron Ballman (AaronBallman)

Changes

We were crashing due to stack exhaustion on rather reasonable C++ template code. After some investigation, I found that we have a stack-allocated object that was huge: InitializationSequence was 7016 bytes. This caused an overflow with deep call stacks in initialization code.

With these change, InitializationSequence is now 248 bytes.

With the original code, testing RelWithDebInfo on Windows 10, all the tests in SemaCXX took about 6s 800ms. The max template depth I could reach on my machine using the code in the issue was 708. After that, I would get -Wstack-exhausted warnings until crashing at 976 instantiations.

With these changes on the same machine, all the tests in SemaCXX took about 6s 500ms. The max template depth I could reach was 1492. After that, I would get -Wstack-exhausted warnings until crashing at 2898 instantiations.

Fixes #88330


Patch is 30.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88546.diff

11 Files Affected:

  • (modified) clang/include/clang/Sema/Initialization.h (+4-2)
  • (modified) clang/include/clang/Sema/Overload.h (+22-47)
  • (modified) clang/lib/Sema/SemaCodeComplete.cpp (+4-2)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+2-1)
  • (modified) clang/lib/Sema/SemaDeclCXX.cpp (+3-2)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+4-3)
  • (modified) clang/lib/Sema/SemaExprCXX.cpp (+3-3)
  • (modified) clang/lib/Sema/SemaInit.cpp (+18-13)
  • (modified) clang/lib/Sema/SemaLookup.cpp (+1-1)
  • (modified) clang/lib/Sema/SemaOverload.cpp (+31-22)
  • (modified) clang/lib/Sema/SemaStmt.cpp (+1-1)
diff --git a/clang/include/clang/Sema/Initialization.h b/clang/include/clang/Sema/Initialization.h
index 2072cd8d1c3ef8..f41ca97f030066 100644
--- a/clang/include/clang/Sema/Initialization.h
+++ b/clang/include/clang/Sema/Initialization.h
@@ -1134,7 +1134,7 @@ class InitializationSequence {
   OverloadingResult FailedOverloadResult;
 
   /// The candidate set created when initialization failed.
-  OverloadCandidateSet FailedCandidateSet;
+  OverloadCandidateSet *FailedCandidateSet;
 
   /// The incomplete type that caused a failure.
   QualType FailedIncompleteType;
@@ -1403,7 +1403,9 @@ class InitializationSequence {
   /// Retrieve a reference to the candidate set when overload
   /// resolution fails.
   OverloadCandidateSet &getFailedCandidateSet() {
-    return FailedCandidateSet;
+    assert(FailedCandidateSet &&
+           "this should have been allocated in the constructor!");
+    return *FailedCandidateSet;
   }
 
   /// Get the overloading result, for when the initialization
diff --git a/clang/include/clang/Sema/Overload.h b/clang/include/clang/Sema/Overload.h
index 76311b00d2fc58..40800c7e1af381 100644
--- a/clang/include/clang/Sema/Overload.h
+++ b/clang/include/clang/Sema/Overload.h
@@ -874,7 +874,8 @@ class Sema;
     ConversionFixItGenerator Fix;
 
     /// Viable - True to indicate that this overload candidate is viable.
-    bool Viable : 1;
+    LLVM_PREFERRED_TYPE(bool)
+    unsigned Viable : 1;
 
     /// Whether this candidate is the best viable function, or tied for being
     /// the best viable function.
@@ -883,12 +884,14 @@ class Sema;
     /// was part of the ambiguity kernel: the minimal non-empty set of viable
     /// candidates such that all elements of the ambiguity kernel are better
     /// than all viable candidates not in the ambiguity kernel.
-    bool Best : 1;
+    LLVM_PREFERRED_TYPE(bool)
+    unsigned Best : 1;
 
     /// IsSurrogate - True to indicate that this candidate is a
     /// surrogate for a conversion to a function pointer or reference
     /// (C++ [over.call.object]).
-    bool IsSurrogate : 1;
+    LLVM_PREFERRED_TYPE(bool)
+    unsigned IsSurrogate : 1;
 
     /// IgnoreObjectArgument - True to indicate that the first
     /// argument's conversion, which for this function represents the
@@ -897,18 +900,20 @@ class Sema;
     /// implicit object argument is just a placeholder) or a
     /// non-static member function when the call doesn't have an
     /// object argument.
-    bool IgnoreObjectArgument : 1;
+    LLVM_PREFERRED_TYPE(bool)
+    unsigned IgnoreObjectArgument : 1;
 
     /// True if the candidate was found using ADL.
-    CallExpr::ADLCallKind IsADLCandidate : 1;
+    LLVM_PREFERRED_TYPE(CallExpr::ADLCallKind)
+    unsigned IsADLCandidate : 1;
 
     /// Whether this is a rewritten candidate, and if so, of what kind?
     LLVM_PREFERRED_TYPE(OverloadCandidateRewriteKind)
     unsigned RewriteKind : 2;
 
     /// FailureKind - The reason why this candidate is not viable.
-    /// Actually an OverloadFailureKind.
-    unsigned char FailureKind;
+    LLVM_PREFERRED_TYPE(OverloadFailureKind)
+    unsigned FailureKind : 5;
 
     /// The number of call arguments that were explicitly provided,
     /// to be used while performing partial ordering of function templates.
@@ -972,7 +977,9 @@ class Sema;
   private:
     friend class OverloadCandidateSet;
     OverloadCandidate()
-        : IsSurrogate(false), IsADLCandidate(CallExpr::NotADL), RewriteKind(CRK_None) {}
+        : IsSurrogate(false),
+          IsADLCandidate(static_cast<unsigned>(CallExpr::NotADL)),
+          RewriteKind(CRK_None) {}
   };
 
   /// OverloadCandidateSet - A set of overload candidates, used in C++
@@ -1070,57 +1077,25 @@ class Sema;
     };
 
   private:
-    SmallVector<OverloadCandidate, 16> Candidates;
-    llvm::SmallPtrSet<uintptr_t, 16> Functions;
-
-    // Allocator for ConversionSequenceLists. We store the first few of these
-    // inline to avoid allocation for small sets.
-    llvm::BumpPtrAllocator SlabAllocator;
+    ASTContext &Ctx;
+    SmallVector<OverloadCandidate, 4> Candidates;
+    llvm::SmallPtrSet<uintptr_t, 4> Functions;
 
     SourceLocation Loc;
     CandidateSetKind Kind;
     OperatorRewriteInfo RewriteInfo;
 
-    constexpr static unsigned NumInlineBytes =
-        24 * sizeof(ImplicitConversionSequence);
-    unsigned NumInlineBytesUsed = 0;
-    alignas(void *) char InlineSpace[NumInlineBytes];
-
     // Address space of the object being constructed.
     LangAS DestAS = LangAS::Default;
 
-    /// If we have space, allocates from inline storage. Otherwise, allocates
-    /// from the slab allocator.
-    /// FIXME: It would probably be nice to have a SmallBumpPtrAllocator
-    /// instead.
-    /// FIXME: Now that this only allocates ImplicitConversionSequences, do we
-    /// want to un-generalize this?
-    template <typename T>
-    T *slabAllocate(unsigned N) {
-      // It's simpler if this doesn't need to consider alignment.
-      static_assert(alignof(T) == alignof(void *),
-                    "Only works for pointer-aligned types.");
-      static_assert(std::is_trivial<T>::value ||
-                        std::is_same<ImplicitConversionSequence, T>::value,
-                    "Add destruction logic to OverloadCandidateSet::clear().");
-
-      unsigned NBytes = sizeof(T) * N;
-      if (NBytes > NumInlineBytes - NumInlineBytesUsed)
-        return SlabAllocator.Allocate<T>(N);
-      char *FreeSpaceStart = InlineSpace + NumInlineBytesUsed;
-      assert(uintptr_t(FreeSpaceStart) % alignof(void *) == 0 &&
-             "Misaligned storage!");
-
-      NumInlineBytesUsed += NBytes;
-      return reinterpret_cast<T *>(FreeSpaceStart);
-    }
 
     void destroyCandidates();
 
   public:
-    OverloadCandidateSet(SourceLocation Loc, CandidateSetKind CSK,
+    OverloadCandidateSet(ASTContext &Ctx, SourceLocation Loc,
+                         CandidateSetKind CSK,
                          OperatorRewriteInfo RewriteInfo = {})
-        : Loc(Loc), Kind(CSK), RewriteInfo(RewriteInfo) {}
+        : Ctx(Ctx), Loc(Loc), Kind(CSK), RewriteInfo(RewriteInfo) {}
     OverloadCandidateSet(const OverloadCandidateSet &) = delete;
     OverloadCandidateSet &operator=(const OverloadCandidateSet &) = delete;
     ~OverloadCandidateSet() { destroyCandidates(); }
@@ -1163,7 +1138,7 @@ class Sema;
     ConversionSequenceList
     allocateConversionSequences(unsigned NumConversions) {
       ImplicitConversionSequence *Conversions =
-          slabAllocate<ImplicitConversionSequence>(NumConversions);
+          Ctx.Allocate<ImplicitConversionSequence>(NumConversions);
 
       // Construct the new objects.
       for (unsigned I = 0; I != NumConversions; ++I)
diff --git a/clang/lib/Sema/SemaCodeComplete.cpp b/clang/lib/Sema/SemaCodeComplete.cpp
index c335017f243eb2..c561fdc93ba00c 100644
--- a/clang/lib/Sema/SemaCodeComplete.cpp
+++ b/clang/lib/Sema/SemaCodeComplete.cpp
@@ -6210,7 +6210,8 @@ QualType Sema::ProduceCallSignatureHelp(Expr *Fn, ArrayRef<Expr *> Args,
   Expr *NakedFn = Fn->IgnoreParenCasts();
   // Build an overload candidate set based on the functions we find.
   SourceLocation Loc = Fn->getExprLoc();
-  OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+  OverloadCandidateSet CandidateSet(Context, Loc,
+                                    OverloadCandidateSet::CSK_Normal);
 
   if (auto ULE = dyn_cast<UnresolvedLookupExpr>(NakedFn)) {
     AddOverloadedCallCandidates(ULE, ArgsWithoutDependentTypes, CandidateSet,
@@ -6411,7 +6412,8 @@ QualType Sema::ProduceConstructorSignatureHelp(QualType Type,
   // FIXME: Provide support for variadic template constructors.
 
   if (CRD) {
-    OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+    OverloadCandidateSet CandidateSet(Context, Loc,
+                                      OverloadCandidateSet::CSK_Normal);
     for (NamedDecl *C : LookupConstructors(CRD)) {
       if (auto *FD = dyn_cast<FunctionDecl>(C)) {
         // FIXME: we can't yet provide correct signature help for initializer
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 720e56692359b3..de22408cd16a03 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -19073,7 +19073,8 @@ static void ComputeSelectedDestructor(Sema &S, CXXRecordDecl *Record) {
   }
 
   SourceLocation Loc = Record->getLocation();
-  OverloadCandidateSet OCS(Loc, OverloadCandidateSet::CSK_Normal);
+  OverloadCandidateSet OCS(S.getASTContext(), Loc,
+                           OverloadCandidateSet::CSK_Normal);
 
   for (auto *Decl : Record->decls()) {
     if (auto *DD = dyn_cast<CXXDestructorDecl>(Decl)) {
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 51c14443d2d8f1..df6381e2cdf3bc 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -8198,7 +8198,8 @@ class DefaultedComparisonAnalyzer
     // we've already found there is no viable 'operator<=>' candidate (and are
     // considering synthesizing a '<=>' from '==' and '<').
     OverloadCandidateSet CandidateSet(
-        FD->getLocation(), OverloadCandidateSet::CSK_Operator,
+        S.getASTContext(), FD->getLocation(),
+        OverloadCandidateSet::CSK_Operator,
         OverloadCandidateSet::OperatorRewriteInfo(
             OO, FD->getLocation(),
             /*AllowRewrittenCandidates=*/!SpaceshipCandidates));
@@ -17409,7 +17410,7 @@ bool Sema::EvaluateStaticAssertMessageAsString(Expr *Message,
     LookupResult MemberLookup(*this, DN, Loc, Sema::LookupMemberName);
     LookupQualifiedName(MemberLookup, RD);
     Empty = MemberLookup.empty();
-    OverloadCandidateSet Candidates(MemberLookup.getNameLoc(),
+    OverloadCandidateSet Candidates(Context, MemberLookup.getNameLoc(),
                                     OverloadCandidateSet::CSK_Normal);
     if (MemberLookup.empty())
       return std::nullopt;
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index b294d2bd9f53f2..c14d484378a866 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -2500,7 +2500,7 @@ bool Sema::DiagnoseEmptyLookup(Scope *S, CXXScopeSpec &SS, LookupResult &R,
 
         // If there's a best viable function among the results, only mention
         // that one in the notes.
-        OverloadCandidateSet Candidates(R.getNameLoc(),
+        OverloadCandidateSet Candidates(Context, R.getNameLoc(),
                                         OverloadCandidateSet::CSK_Normal);
         AddOverloadedCallCandidates(R, ExplicitTemplateArgs, Args, Candidates);
         OverloadCandidateSet::iterator Best;
@@ -2548,7 +2548,7 @@ bool Sema::DiagnoseEmptyLookup(Scope *S, CXXScopeSpec &SS, LookupResult &R,
     NamedDecl *ND = Corrected.getFoundDecl();
     if (ND) {
       if (Corrected.isOverloaded()) {
-        OverloadCandidateSet OCS(R.getNameLoc(),
+        OverloadCandidateSet OCS(Context, R.getNameLoc(),
                                  OverloadCandidateSet::CSK_Normal);
         OverloadCandidateSet::iterator Best;
         for (NamedDecl *CD : Corrected) {
@@ -6501,7 +6501,8 @@ static TypoCorrection TryTypoCorrectionForCall(Sema &S, Expr *Fn,
           Sema::CTK_ErrorRecovery)) {
     if (NamedDecl *ND = Corrected.getFoundDecl()) {
       if (Corrected.isOverloaded()) {
-        OverloadCandidateSet OCS(NameLoc, OverloadCandidateSet::CSK_Normal);
+        OverloadCandidateSet OCS(S.getASTContext(), NameLoc,
+                                 OverloadCandidateSet::CSK_Normal);
         OverloadCandidateSet::iterator Best;
         for (NamedDecl *CD : Corrected) {
           if (FunctionDecl *FD = dyn_cast<FunctionDecl>(CD))
diff --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index ce9d5c26e21858..0b7bda6d5f2f25 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -2586,7 +2586,7 @@ static bool resolveAllocationOverload(
     Sema &S, LookupResult &R, SourceRange Range, SmallVectorImpl<Expr *> &Args,
     bool &PassAlignment, FunctionDecl *&Operator,
     OverloadCandidateSet *AlignedCandidates, Expr *AlignArg, bool Diagnose) {
-  OverloadCandidateSet Candidates(R.getNameLoc(),
+  OverloadCandidateSet Candidates(S.getASTContext(), R.getNameLoc(),
                                   OverloadCandidateSet::CSK_Normal);
   for (LookupResult::iterator Alloc = R.begin(), AllocEnd = R.end();
        Alloc != AllocEnd; ++Alloc) {
@@ -3919,7 +3919,7 @@ static bool resolveBuiltinNewDeleteOverload(Sema &S, CallExpr *TheCall,
   R.suppressDiagnostics();
 
   SmallVector<Expr *, 8> Args(TheCall->arguments());
-  OverloadCandidateSet Candidates(R.getNameLoc(),
+  OverloadCandidateSet Candidates(S.getASTContext(), R.getNameLoc(),
                                   OverloadCandidateSet::CSK_Normal);
   for (LookupResult::iterator FnOvl = R.begin(), FnOvlEnd = R.end();
        FnOvl != FnOvlEnd; ++FnOvl) {
@@ -6488,7 +6488,7 @@ static bool TryClassUnification(Sema &Self, Expr *From, Expr *To,
 static bool FindConditionalOverload(Sema &Self, ExprResult &LHS, ExprResult &RHS,
                                     SourceLocation QuestionLoc) {
   Expr *Args[2] = { LHS.get(), RHS.get() };
-  OverloadCandidateSet CandidateSet(QuestionLoc,
+  OverloadCandidateSet CandidateSet(Self.getASTContext(), QuestionLoc,
                                     OverloadCandidateSet::CSK_Operator);
   Self.AddBuiltinOperatorCandidates(OO_Conditional, QuestionLoc, Args,
                                     CandidateSet);
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 6b8ce0f633a3ea..883bdece74daf1 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -6113,8 +6113,10 @@ static bool TryOCLZeroOpaqueTypeInitialization(Sema &S,
 InitializationSequence::InitializationSequence(
     Sema &S, const InitializedEntity &Entity, const InitializationKind &Kind,
     MultiExprArg Args, bool TopLevelOfInitList, bool TreatUnavailableAsInvalid)
-    : FailedOverloadResult(OR_Success),
-      FailedCandidateSet(Kind.getLocation(), OverloadCandidateSet::CSK_Normal) {
+    : FailedOverloadResult(OR_Success), FailedCandidateSet(nullptr) {
+  FailedCandidateSet = S.getASTContext().Allocate<OverloadCandidateSet>();
+  FailedCandidateSet = new (FailedCandidateSet) OverloadCandidateSet(
+      S.getASTContext(), Kind.getLocation(), OverloadCandidateSet::CSK_Normal);
   InitializeFrom(S, Entity, Kind, Args, TopLevelOfInitList,
                  TreatUnavailableAsInvalid);
 }
@@ -6808,7 +6810,8 @@ static ExprResult CopyObject(Sema &S,
   // Perform overload resolution using the class's constructors. Per
   // C++11 [dcl.init]p16, second bullet for class types, this initialization
   // is direct-initialization.
-  OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+  OverloadCandidateSet CandidateSet(S.getASTContext(), Loc,
+                                    OverloadCandidateSet::CSK_Normal);
   DeclContext::lookup_result Ctors = S.LookupConstructors(Class);
 
   OverloadCandidateSet::iterator Best;
@@ -6949,7 +6952,8 @@ static void CheckCXX98CompatAccessibleCopy(Sema &S,
     return;
 
   // Find constructors which would have been considered.
-  OverloadCandidateSet CandidateSet(Loc, OverloadCandidateSet::CSK_Normal);
+  OverloadCandidateSet CandidateSet(S.getASTContext(), Loc,
+                                    OverloadCandidateSet::CSK_Normal);
   DeclContext::lookup_result Ctors =
       S.LookupConstructors(cast<CXXRecordDecl>(Record->getDecl()));
 
@@ -9735,7 +9739,7 @@ bool InitializationSequence::Diagnose(Sema &S,
     switch (FailedOverloadResult) {
     case OR_Ambiguous:
 
-      FailedCandidateSet.NoteCandidates(
+      FailedCandidateSet->NoteCandidates(
           PartialDiagnosticAt(
               Kind.getLocation(),
               Failure == FK_UserConversionOverloadFailed
@@ -9749,7 +9753,8 @@ bool InitializationSequence::Diagnose(Sema &S,
       break;
 
     case OR_No_Viable_Function: {
-      auto Cands = FailedCandidateSet.CompleteCandidates(S, OCD_AllCandidates, Args);
+      auto Cands =
+          FailedCandidateSet->CompleteCandidates(S, OCD_AllCandidates, Args);
       if (!S.RequireCompleteType(Kind.getLocation(),
                                  DestType.getNonReferenceType(),
                           diag::err_typecheck_nonviable_condition_incomplete,
@@ -9759,7 +9764,7 @@ bool InitializationSequence::Diagnose(Sema &S,
           << OnlyArg->getType() << Args[0]->getSourceRange()
           << DestType.getNonReferenceType();
 
-      FailedCandidateSet.NoteCandidates(S, Args, Cands);
+      FailedCandidateSet->NoteCandidates(S, Args, Cands);
       break;
     }
     case OR_Deleted: {
@@ -9768,7 +9773,7 @@ bool InitializationSequence::Diagnose(Sema &S,
         << Args[0]->getSourceRange();
       OverloadCandidateSet::iterator Best;
       OverloadingResult Ovl
-        = FailedCandidateSet.BestViableFunction(S, Kind.getLocation(), Best);
+        = FailedCandidateSet->BestViableFunction(S, Kind.getLocation(), Best);
       if (Ovl == OR_Deleted) {
         S.NoteDeletedFunction(Best->Function);
       } else {
@@ -9946,7 +9951,7 @@ bool InitializationSequence::Diagnose(Sema &S,
     // bad.
     switch (FailedOverloadResult) {
       case OR_Ambiguous:
-        FailedCandidateSet.NoteCandidates(
+        FailedCandidateSet->NoteCandidates(
             PartialDiagnosticAt(Kind.getLocation(),
                                 S.PDiag(diag::err_ovl_ambiguous_init)
                                     << DestType << ArgsRange),
@@ -10000,7 +10005,7 @@ bool InitializationSequence::Diagnose(Sema &S,
           break;
         }
 
-        FailedCandidateSet.NoteCandidates(
+        FailedCandidateSet->NoteCandidates(
             PartialDiagnosticAt(
                 Kind.getLocation(),
                 S.PDiag(diag::err_ovl_no_viable_function_in_init)
@@ -10011,7 +10016,7 @@ bool InitializationSequence::Diagnose(Sema &S,
       case OR_Deleted: {
         OverloadCandidateSet::iterator Best;
         OverloadingResult Ovl
-          = FailedCandidateSet.BestViableFunction(S, Kind.getLocation(), Best);
+          = FailedCandidateSet->BestViableFunction(S, Kind.getLocation(), Best);
         if (Ovl != OR_Deleted) {
           S.Diag(Kind.getLocation(), diag::err_ovl_deleted_init)
               << DestType << ArgsRange;
@@ -10088,7 +10093,7 @@ bool InitializationSequence::Diagnose(Sema &S,
       << Args[0]->getSourceRange();
     OverloadCandidateSet::iterator Best;
     OverloadingResult Ovl
-      = FailedCandidateSet.BestViableFunction(S, Kind.getLocation(), Best);
+      = FailedCandidateSet->BestViableFunction(S, Kind.getLocation(), Best);
     (void)Ovl;
     assert(Ovl == OR_Success && "Inconsistent overload resolution");
     CXXConstructorDecl *CtorDecl = cast<CXXConstructorDecl>(Best->Function);
@@ -10818,7 +10823,7 @@ QualType Sema::DeduceTemplateSpecializationFromInitializer(
   //
   // Since we know we're initializing a class type of a type unrelated to that
   // of the initializer, this reduces to something fairly reasonable.
-  OverloadCandidateSet Candidates(Kind.getLocation(),
+  OverloadCandidateSet Candidates(Context, Kind.getLocation(),
                                   OverloadCandidateSet::CSK_Normal);
   OverloadCandidateSet::iterator Best;
 
diff --git a/clang/lib/Sema/SemaLookup.cpp b/clang/lib/Sema/SemaLookup.cpp
index d65f52b8efe81f..eedba5c1394770 100644
--- a/clang/lib/Sema/SemaLookup.cpp
+++ b/clang/lib/Sema/SemaLookup.cpp
@@ -3481,7 +3481,7 @@ Sema::LookupSpecialMember(CXXRecordDecl *RD, CXXSpecialMemberKind SM,
   // Now we perform lookup on the name we computed earlier and do overload
   // resolution. Lookup is only performed directly into the class since there
   // will always be a (possibly implicit) declaration to shadow any others.
-  OverloadCandidateSet OCS(LookupLoc, OverloadCandidateSet::CSK_Normal);
+  OverloadCandidateSet OCS(Context, LookupLoc, OverloadCandidateSet::CSK_Normal);
   DeclContext::lookup_result R = RD->lookup(Name);
 
   if (R.empty()) {
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index e1155dc2d5d285..e94db8aee144b3 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -1057,7 +1057,8 @@ bool OverloadCandidateSet::OperatorRewriteInfo::shouldAddReversed(
 void Overlo...
[truncated]

Copy link

github-actions bot commented Apr 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

clang/docs/ReleaseNotes.rst Outdated Show resolved Hide resolved
clang/lib/Sema/SemaOverload.cpp Outdated Show resolved Hide resolved
FailedCandidateSet(Kind.getLocation(), OverloadCandidateSet::CSK_Normal) {
: FailedOverloadResult(OR_Success), FailedCandidateSet(nullptr) {
FailedCandidateSet = S.getASTContext().Allocate<OverloadCandidateSet>();
FailedCandidateSet = new (FailedCandidateSet) OverloadCandidateSet(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a future optimization, I wonder if we could defer this allocation until we already know that we've failed? Perhaps in SetOverloadFailure? Not sure I can grok the analysis at the moment to know how possible this is though.

Co-authored-by: Erich Keane <ekeane@nvidia.com>
@AaronBallman AaronBallman changed the title Improve stack usage to increase template instantiation depth Improve stack usage to increase recursive initialization depth Apr 12, 2024
@AaronBallman
Copy link
Collaborator Author

The precommit CI failures are real; I'm still investigating what's going sideways.

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a roller coaster!
LGTM

These don't need to be long-lived and there's likely to be a lot of
overload and initialization code in a TU, so this should hopefully
reduce memory pressure from the heap allocations. It also simplifies
the patch by not needing to pass around an ASTContext.
@AaronBallman AaronBallman merged commit 4082a75 into llvm:main Apr 16, 2024
4 of 5 checks passed
@AaronBallman AaronBallman deleted the aballman-stack-size branch April 16, 2024 17:48
@vitalybuka
Copy link
Collaborator

@nikic
Copy link
Contributor

nikic commented Apr 17, 2024

This change seems to cause significant compile-time regressions for C++ code (https://llvm-compile-time-tracker.com/compare.php?from=184ba038ac1d444980b3e554b0057f3f30c516ab&to=4082a7554521572a65a5a0008c4661a534df659d&stat=instructions%3Au).

Probably most damning are the times for the clang build itself (https://llvm-compile-time-tracker.com/compare_clang.php?from=184ba038ac1d444980b3e554b0057f3f30c516ab&to=4082a7554521572a65a5a0008c4661a534df659d&stat=instructions%3Au&sortBy=interestingness) where files like X86ISelDAGToDAG.cpp regress by >3% even in an optimized build.

@vitalybuka
Copy link
Collaborator

have ideas about leak, but not sure what to do with perf regression and #88330

Proposing revert #89006

for (unsigned I = 0; I != NumConversions; ++I)
new (&Conversions[I]) ImplicitConversionSequence();

new ImplicitConversionSequence[NumConversions];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unusual to keep ownership in ArrayRef.
Could we make ConversionSequenceList just a std::vector if it owns the memory?

vitalybuka added a commit that referenced this pull request Apr 17, 2024
…h" (#89006)

Reverts #88546

Leak and performance regression.
Details in #88546
@AaronBallman AaronBallman restored the aballman-stack-size branch April 19, 2024 12:53
@cor3ntin
Copy link
Contributor

@AaronBallman ping

@AaronBallman
Copy link
Collaborator Author

@AaronBallman ping

I've not been able to get back on to this, unfortunately. I'm not certain when I'll have time, either.

@cor3ntin
Copy link
Contributor

It might be worth reopening the PR and asking for help? otherwise we'll forget

And some things that should not have been forgotten were lost.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category quality-of-implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants