Commit a01be04

Authored Feb 11, 2022
Increase arm32/arm64 maximum instruction group size (#65153)
We require that the maximum number of prolog instructions all fit in one instruction group. Recent changes appear to have increased the number of instructions we generate in the prolog, leading to a NOWAY assert on Release builds and a test failure on linux-arm64. Bump up the number to avoid this problem, and leave some headroom for possible additional needs. Fixes #64162, #64793.
1 parent 77d6833 commit a01be04

2 files changed, +42 -10 lines changed


src/coreclr/jit/emit.cpp (+3 -2)
@@ -1454,9 +1454,10 @@ void* emitter::emitAllocAnyInstr(size_t sz, emitAttr opsz)

     assert(IsCodeAligned(emitCurIGsize));

-    /* Make sure we have enough space for the new instruction */
+    // Make sure we have enough space for the new instruction.
+    // `igInsCnt` is currently a byte, so we can't have more than 255 instructions in a single insGroup.

-    if ((emitCurIGfreeNext + sz >= emitCurIGfreeEndp) || emitForceNewIG)
+    if ((emitCurIGfreeNext + sz >= emitCurIGfreeEndp) || emitForceNewIG || (emitCurIGinsCnt >= 255))
     {
         emitNxtIG(true);
     }
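
The new `emitCurIGinsCnt >= 255` check guards an 8-bit counter against wrapping. The following is a minimal, self-contained sketch of that idea and not the JIT's actual code: `ToyGroup` and `ToyEmitter` are hypothetical names, and the only thing carried over from the change is the assumption that the per-group instruction count is stored in one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an instruction group; only the 8-bit count matters here.
struct ToyGroup
{
    std::uint8_t insCnt = 0; // assumption: mirrors "igInsCnt is currently a byte"
};

// Hypothetical emitter that only counts instructions per group.
class ToyEmitter
{
public:
    ToyEmitter() : m_groups(1) {}

    // Record one instruction, starting a new group first if the current one
    // already holds 255 instructions (the largest value a byte can store).
    void addInstruction()
    {
        if (m_groups.back().insCnt >= 255)
        {
            m_groups.emplace_back();
        }
        ++m_groups.back().insCnt;
    }

    std::size_t groupCount() const { return m_groups.size(); }
    unsigned    lastGroupCount() const { return m_groups.back().insCnt; }

private:
    std::vector<ToyGroup> m_groups;
};
```

Without the check, incrementing a `std::uint8_t` that already holds 255 would wrap to 0, so the group would report far fewer instructions than it actually contains. Splitting before the increment keeps every stored count in range.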

src/coreclr/jit/emit.h (+39 -8)
@@ -1785,19 +1785,50 @@ class emitter
 void emitHandleMemOp(GenTreeIndir* indir, instrDesc* id, insFormat fmt, instruction ins);
 void spillIntArgRegsToShadowSlots();

-/************************************************************************/
-/* The logic that creates and keeps track of instruction groups */
-/************************************************************************/
+/************************************************************************/
+/* The logic that creates and keeps track of instruction groups */
+/************************************************************************/
+
+// SC_IG_BUFFER_SIZE defines the size, in bytes, of the single, global instruction group buffer.
+// When a label is reached, or the buffer is filled, the precise amount of the buffer that was
+// used is copied to a newly allocated, precisely sized buffer, and the global buffer is reset
+// for use with the next set of instructions (see emitSavIG). If the buffer was filled before
+// reaching a label, the next instruction group will be an "overflow", or "extension" group
+// (marked with IGF_EXTEND). Thus, the size of the global buffer shouldn't matter (as long as it
+// can hold at least one of the largest instruction descriptor forms), since we can always overflow
+// to subsequent instruction groups.
+//
+// The only place where this fixed instruction group size is a problem is in the main function prolog,
+// where we only support a single instruction group, and no extension groups. We should really fix that.
+// Thus, the buffer size needs to be large enough to hold the maximum number of instructions that
+// can possibly be generated into the prolog instruction group. That is difficult to statically determine.
+//
+// If we do generate an overflow prolog group, we will hit a NOWAY assert and fall back to MinOpts.
+// This should reduce the number of instructions generated into the prolog.
+//
+// Note that OSR prologs require additional code not seen in normal prologs.
+//
+// Also, note that DEBUG and non-DEBUG builds have different instrDesc sizes, and there are multiple
+// sizes of instruction descriptors, so the number of instructions that will fit in the largest
+// instruction group depends on the instruction mix as well as DEBUG/non-DEBUG build type. See the
+// EMITTER_STATS output for various statistics related to this.
+//
+CLANG_FORMAT_COMMENT_ANCHOR;

 #ifdef TARGET_ARMARCH
-// The only place where this limited instruction group size is a problem is
-// in the prolog, where we only support a single instruction group. We should really fix that.
 // ARM32 and ARM64 both can require a bigger prolog instruction group. One scenario is where
 // a function uses all the incoming integer and single-precision floating-point arguments,
 // and must store them all to the frame on entry. If the frame is very large, we generate
-// ugly code like "movw r10, 0x488; add r10, sp; vstr s0, [r10]" for each store, which
-// eats up our insGroup buffer.
-#define SC_IG_BUFFER_SIZE (100 * sizeof(emitter::instrDesc) + 14 * SMALL_IDSC_SIZE)
+// ugly code like:
+//     movw r10, 0x488
+//     add r10, sp
+//     vstr s0, [r10]
+// for each store, or, to load arguments into registers:
+//     movz xip1, #0x6cd0
+//     movk xip1, #2 LSL #16
+//     ldr w8, [fp, xip1] // [V10 arg10]
+// which eats up our insGroup buffer.
+#define SC_IG_BUFFER_SIZE (200 * sizeof(emitter::instrDesc))
 #else // !TARGET_ARMARCH
 #define SC_IG_BUFFER_SIZE (50 * sizeof(emitter::instrDesc) + 14 * SMALL_IDSC_SIZE)
 #endif // !TARGET_ARMARCH
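
The buffer behavior described in the new comment can be illustrated with a small model. This is a sketch under stated assumptions, not the emitter's implementation: `ScratchBuffer`, `SavedGroup`, `append`, and `label` are hypothetical names, sizes are plain byte counts rather than real `instrDesc` sizes, and a thrown exception stands in for the NOWAY assert and MinOpts fallback that the comment mentions for the prolog case.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical record of a saved group: a precisely sized copy of the used bytes.
struct SavedGroup
{
    std::vector<unsigned char> bytes;
    bool                       isExtension; // plays the role of IGF_EXTEND
};

// Hypothetical model of a single fixed-capacity scratch buffer shared by all groups.
class ScratchBuffer
{
public:
    ScratchBuffer(std::size_t capacity, bool allowExtension)
        : m_capacity(capacity), m_allowExtension(allowExtension)
    {
        m_scratch.reserve(capacity);
    }

    // Append a descriptor of descSize bytes. If it does not fit, save the
    // current contents and continue in an extension group, unless extension
    // groups are not supported (the prolog case in this model).
    void append(std::size_t descSize)
    {
        assert(descSize <= m_capacity); // buffer must hold at least one descriptor
        if (m_scratch.size() + descSize > m_capacity)
        {
            if (!m_allowExtension)
            {
                throw std::length_error("group overflow where no extension group is supported");
            }
            save();
            m_nextIsExtension = true;
        }
        m_scratch.insert(m_scratch.end(), descSize, 0);
    }

    // A label ends the current group; the next group is not an extension.
    void label()
    {
        save();
        m_nextIsExtension = false;
    }

    const std::vector<SavedGroup>& saved() const { return m_saved; }

private:
    // Copy exactly the used portion, then reset the scratch buffer for reuse.
    void save()
    {
        m_saved.push_back({std::vector<unsigned char>(m_scratch.begin(), m_scratch.end()), m_nextIsExtension});
        m_scratch.clear();
    }

    std::size_t                m_capacity;
    bool                       m_allowExtension;
    bool                       m_nextIsExtension = false;
    std::vector<unsigned char> m_scratch;
    std::vector<SavedGroup>    m_saved;
};
```

In this model a buffer constructed with `allowExtension == true` never fails, however small its capacity, which is why the comment says the size "shouldn't matter" outside the prolog. A buffer constructed with `allowExtension == false` fails as soon as its contents exceed the capacity, which is the situation that raising `SC_IG_BUFFER_SIZE` is meant to make less likely.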
