Try changing alignment (#76451)
Looking at CPU traces for microbenchmarks, I noticed a hotspot in memset (the flavor that uses AVX2 instructions) for the instruction that clears the very last double quadword at the end of an allocation context. Also, the buffer being cleared is not aligned on a 32-byte boundary.

Two tiny changes address this:

1. Adding additional padding at the start of regions aligns the allocation context for the microbenchmark cases.
2. Increasing CLR_SIZE slightly ensures the end of an allocation context doesn't consistently fall on a page boundary.

Change 1 makes sure we start with an aligned allocation context at the start of a region.
Change 2 minimizes the number of movdqu instructions executed and makes sure we don't consistently hit a new page at the end of the memset range.
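The arithmetic behind change 2 can be sketched as follows. This is only an illustration, not code from the commit: the 4096-byte page size, the 32-byte vector width, the page-aligned start address `kStart`, and the helper names `is_aligned` and `chunk_end` are all assumptions made for the example. Only the two CLR_SIZE values come from the diff.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed values for illustration only; neither is stated in the commit.
constexpr std::size_t kPageSize   = 4096;  // typical x64 page size
constexpr std::size_t kVectorSize = 32;    // width of one AVX2 store

// CLR_SIZE before and after the change, as shown in the diff.
constexpr std::size_t kOldClrSize = 8 * 1024;
constexpr std::size_t kNewClrSize = 8 * 1024 + 32;

// True when addr is a multiple of alignment (alignment must be a power of two).
constexpr bool is_aligned(std::uintptr_t addr, std::size_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}

// One-past-the-end address of a chunk of clr_size bytes starting at start.
constexpr std::uintptr_t chunk_end(std::uintptr_t start, std::size_t clr_size)
{
    return start + clr_size;
}

// Hypothetical page-aligned start of an allocation context.
constexpr std::uintptr_t kStart = 0x10000;

// With the old size, a page-aligned chunk always ends exactly on a page boundary.
static_assert(is_aligned(chunk_end(kStart, kOldClrSize), kPageSize),
              "old CLR_SIZE ends on a page boundary");

// With the new size, the end lands 32 bytes past the page boundary...
static_assert(!is_aligned(chunk_end(kStart, kNewClrSize), kPageSize),
              "new CLR_SIZE does not end on a page boundary");

// ...while remaining a multiple of the 32-byte vector width.
static_assert(is_aligned(chunk_end(kStart, kNewClrSize), kVectorSize),
              "new CLR_SIZE keeps the end 32-byte aligned");
```

Under these assumptions, the extra 32 bytes keep the chunk length a whole number of 32-byte stores while moving its end off the page boundary.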
PeterSolMS authored Oct 14, 2022
1 parent 65d2064 commit d47767a
Showing 2 changed files with 3 additions and 2 deletions.
4 changes: 2 additions & 2 deletions src/coreclr/gc/gc.cpp
@@ -1839,9 +1839,9 @@ uint8_t* gc_heap::pad_for_alignment_large (uint8_t* newAlloc, int requiredAlignm

 //CLR_SIZE is the max amount of bytes from gen0 that is set to 0 in one chunk
 #ifdef SERVER_GC
-#define CLR_SIZE ((size_t)(8*1024))
+#define CLR_SIZE ((size_t)(8*1024+32))
 #else //SERVER_GC
-#define CLR_SIZE ((size_t)(8*1024))
+#define CLR_SIZE ((size_t)(8*1024+32))
 #endif //SERVER_GC

 #define END_SPACE_AFTER_GC (loh_size_threshold + MAX_STRUCTALIGN)
1 change: 1 addition & 0 deletions src/coreclr/gc/gcpriv.h
@@ -5653,6 +5653,7 @@ struct gap_reloc_pair

 struct DECLSPEC_ALIGN(8) aligned_plug_and_gap
 {
+    size_t additional_pad;
     plug_and_gap plugandgap;
 };
