GC's VXSort question #64164

EgorBo · 2022-01-23T15:20:39Z

I noticed that the GC uses VXSort (AVX2/AVX512), but only on Windows-x64. So I assume it still has to be enabled for Linux-x64 and rewritten with NEON for Arm64?

(a screenshot, because it's not possible to reference lines in gc.cpp via github 😄)

I only tested it on the Plaintext-MVC benchmark (which allocates a lot of short-lived objects) in our perf lab, and it seems that VXSort regresses P90 across 7 runs and has no effect on RPS. It also adds 227 KB to the native size (coreclr.dll + clrgc.dll combined).

Is there a scenario I can simulate in our perf lab to see benefits from it, or does it only target large real-world workloads?
I am asking because I am wondering whether it is worth porting to NEON SIMD for Arm64.

ghost · 2022-01-23T15:20:46Z

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details


Author:	EgorBo
Assignees:	-
Labels:	`area-Diagnostics-coreclr`, `untriaged`
Milestone:	-

ghost · 2022-01-23T15:21:23Z

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details


Author:	EgorBo
Assignees:	-
Labels:	`question`, `area-GC-coreclr`, `untriaged`
Milestone:	-

danmoseley · 2022-01-23T16:01:59Z

@PeterSolMS

Linking #37159

kunalspathak · 2022-01-24T18:32:45Z

Some notes about INTROSORT are in #60166 (comment).

Maoni0 · 2022-07-07T23:43:04Z

thanks @EgorBo for the data.

that's interesting because if you are just allocating some temp objects you shouldn't even hit the vectorized sorting code path. if you took a trace with cpu samples, do you see any samples captured in do_vxsort at all (or if you set a breakpoint on do_vxsort, do you see it get hit)? just checking whether it's a matter of "the benchmark is so small that any code change could disturb it" or whether it's really caused by the vectorized sorting.
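To make the "is the code path hit at all?" question concrete, here is a minimal sketch of a size-gated sort dispatch that reports which path it took. It assumes the vectorized path is guarded by a CPU check and a minimum list size. The threshold value, `sort_mark_list`, `cpu_supports_vectorized_sort`, and `SortPath` are hypothetical names for illustration and are not taken from gc.cpp; both branches use `std::sort` as a placeholder.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical threshold: at or below this many entries the scalar sort is used.
// The real value in gc.cpp may differ.
constexpr std::size_t kVectorizedThreshold = 8 * 1024;

enum class SortPath { None, Scalar, Vectorized };

// Stand-in for a CPU feature check (hypothetical; always true in this sketch).
inline bool cpu_supports_vectorized_sort() { return true; }

// Sorts the list and reports which path was chosen.
// Both paths call std::sort here; only the dispatch decision is illustrated.
inline SortPath sort_mark_list(std::vector<std::uintptr_t>& items)
{
    if (items.size() <= 1)
        return SortPath::None;

    if (cpu_supports_vectorized_sort() && items.size() > kVectorizedThreshold)
    {
        std::sort(items.begin(), items.end()); // placeholder for a SIMD sort
        return SortPath::Vectorized;
    }

    std::sort(items.begin(), items.end()); // placeholder for a scalar introsort
    return SortPath::Scalar;
}
```

Under these assumptions, a workload whose lists stay small would only ever return `SortPath::Scalar`, which is the kind of result a breakpoint or CPU samples in do_vxsort would confirm or rule out.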

Maoni0 · 2022-07-07T23:43:34Z

I don't think this needs to be 7.0 but let me know if you don't agree.

dotnet-issue-labeler bot added the `area-Diagnostics-coreclr` and `untriaged` labels on Jan 23, 2022

EgorBo added the `area-GC-coreclr` and `question` labels and removed the `area-Diagnostics-coreclr` label on Jan 23, 2022

mangod9 removed the `untriaged` label on Feb 28, 2022

mangod9 added this to the 7.0.0 milestone on Feb 28, 2022

Maoni0 changed the milestone from 7.0.0 to Future on Jul 7, 2022

kunalspathak mentioned this issue Oct 9, 2024

ARM64 GC: Use SVE when sorting the mark list #108473

Open
