
Try changing alignment #76451

Merged · 4 commits · Oct 14, 2022
Conversation

PeterSolMS
Contributor

Looking at CPU traces for microbenchmarks, I noticed a hotspot in memset (the flavor that uses AVX2 instructions) for the instruction that clears the very last double quadword at the end of an allocation context. Also, the buffer being cleared is not aligned on a 32-byte boundary.

Two tiny changes address this:

  1. Adding additional padding at the start of regions aligns the allocation context for the microbenchmark cases.
  2. Increasing CLR_SIZE slightly ensures the end of an allocation context doesn't consistently fall on a page boundary.

Why change 2 helps is not clear, but the measurements say it does: the Perf_String.Replace_Char_Custom benchmark regressed by about 1.8% for regions vs. segments without these changes, but shows the same or slightly better performance with them.

@ghost

ghost commented Sep 30, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.


Author: PeterSolMS
Assignees: PeterSolMS
Labels: area-GC-coreclr
Milestone: -

@mangod9
Member

mangod9 commented Sep 30, 2022

Hmm, interesting finding. My worry here would be whether we are "over-fitting" the perf benefits to the microbenchmarks; we probably aren't quite sure whether real scenarios would see benefits or regressions?

@Maoni0
Member

Maoni0 commented Sep 30, 2022

this applies to all scenarios, not just microbenchmarks. this is how we clear memory in general. we are using 8 bytes more per region, which is a tiny amount, but we avoid the unaligned memsets for the most part, so 1) is definitely a good thing.

I'm also unclear why 2) matters - maybe now the latency of going to the next page is partially hidden by the last clear, since the 2nd-to-last clear hits it now? we can take a look at the individual instruction cost in the loop when you are back.

@PeterSolMS
Contributor Author
Contributor Author

One reason I found that 2) is beneficial is that making the size 8k+32 minimizes the number of vmovdqu instructions: at 8k we execute 9 of them, while at 8k+32 we execute only 1. Traces I collected with a C++ replica of the allocator's behavior also show that hitting a new page with a vmovdqu to an unaligned address is particularly costly, presumably because an access straddling two pages is more complicated to handle.

3 participants