Windows cpuSet for application has different performance for different textual representations of the same cpuSet #576

RobertHenry6bev · 2023-04-20T23:03:11Z

I run the dotnet/aspnet/teche/plaintext benchmark with the PlatformBenchmarks application on a modern Intel server machine, running modern Windows 11 2022. The load generator is on an adjacent server in the same rack; the network link is not a bottleneck. The server has 2 sockets per board, 64 cores per socket, with 2-way SMT enabled, for a total of 128 "logical processors". There are 2 NUMA domains.

The apparent local maximum in rps is reached by setting the cpuSet of the PlatformBenchmarks application to 26 cores in the same NUMA domain.

The cpuSet specification of "0-25" runs at 483 krps.
The cpuSet specification of "0-0,1-1,2-2,3-3,4-4,5-5,6-6,7-7,8-8,9-9,10-10,11-11,12-12,13-13,14-14,15-15,16-16,17-17,18-18,19-19,20-20,21-21,22-22,23-23,24-24,25-25" runs roughly 10% slower, e.g. at 447 krps.

This is repeatable.

I would expect semantically identical specifications of the cpuSet to have equivalent behavior at runtime.
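
To make "semantically identical" concrete, here is a minimal sketch of how both strings expand to the same set of logical processors and the same affinity bitmask. The helpers `parse_cpu_set` and `affinity_mask` are hypothetical names I made up for illustration; this is not the parser the benchmark tooling actually uses, and it assumes the spec is a comma-separated list of single indices or inclusive `first-last` ranges.

```python
from __future__ import annotations


def parse_cpu_set(spec: str) -> set[int]:
    """Expand a spec such as "0-3,8,10-11" into a set of logical processor indices.

    Hypothetical helper: assumes comma-separated items, each either a single
    index ("8") or an inclusive range ("0-3").
    """
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        lo, sep, hi = part.partition("-")
        first = int(lo)
        last = int(hi) if sep else first
        if last < first:
            raise ValueError(f"descending range: {part!r}")
        cpus.update(range(first, last + 1))
    return cpus


def affinity_mask(cpus: set[int]) -> int:
    """Fold a set of processor indices into a single bitmask (bit i = processor i)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


short_form = "0-25"
long_form = ",".join(f"{i}-{i}" for i in range(26))  # "0-0,1-1,...,25-25"

# Both textual forms describe exactly the same 26 logical processors.
assert parse_cpu_set(short_form) == parse_cpu_set(long_form)
print(hex(affinity_mask(parse_cpu_set(short_form))))  # prints 0x3ffffff
```

If the two forms really do end up as the same final mask, then any runtime difference would have to come from how the specification is applied, not from what it denotes.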

Speculation: the long form "0-0,1-1, ..." incrementally tells the kernel what the cpuSet is. Perhaps the incremental change runs afoul of the NUMA domain?

I need cpuSet to "do the right thing" so I can experiment with NUMA splits, uncore routing, and more.
