[mono][Perf] MonoAOT Perf_Single and Perf_Double Regressions on 6/3/2024 6:35:27 PM #104076

Open
performanceautofiler bot opened this issue Jun 11, 2024 · 28 comments · May be fixed by #105038
Labels: arch-arm64, arch-x64, area-System.Numerics, needs-further-triage, os-linux, runtime-mono, tenet-performance, tenet-performance-benchmarks

Comments

@performanceautofiler

Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono
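
To see which commits fall between the Baseline and Compare hashes above, a local dotnet/runtime checkout can be queried with git. This is a minimal sketch, assuming a clone that contains both commits; the function name `commits_between` is hypothetical and not part of any tooling referenced in this issue.

```shell
# Hypothetical helper: list the commits reachable from the Compare hash
# but not from the Baseline hash, oldest first.
commits_between() {
  local repo="$1"
  local baseline="$2"
  local compare="$3"
  git -C "$repo" log --oneline --reverse "${baseline}..${compare}"
}

# Usage (assumes ./runtime is a dotnet/runtime clone containing both commits):
# commits_between ./runtime 59e8bbcf83b664c3de6cfa553d9bbfad76578765 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
```

Each line of output is one candidate commit for the regression, which narrows the search before any benchmark is rerun.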

Regressions in System.Numerics.Tests.Perf_BitOperations

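| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 367.02 ns | 456.62 ns | 1.24 | 0.03 | False | | | |
| | 374.80 ns | 442.65 ns | 1.18 | 0.03 | False | | | |
| | 364.93 ns | 451.19 ns | 1.24 | 0.04 | False | | | |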
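
The Test/Base column appears to be the Test time divided by the Baseline time; the rows above are consistent with that reading. A small sketch for recomputing it from a pair of timings (the helper name `ratio` is an assumption for illustration, and both values must be in the same unit):

```shell
# Hypothetical helper: recompute a Test/Base ratio from the Baseline and Test
# timings, rounded to two decimals as in the table.
ratio() {
  awk -v base="$1" -v test="$2" 'BEGIN { printf "%.2f\n", test / base }'
}

ratio 367.02 456.62   # first row of the table; prints 1.24
```

A ratio above 1 means the Compare build is slower than the Baseline build for that benchmark.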

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.(sh/cmd) or, if the system setup is the same, downloaded from the payload above; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder
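
Before running the steps below, it can help to confirm that the build outputs named in the list are where the commands expect them. A minimal sketch, assuming the CoreCLR and Mono directory layout quoted above; the function name `check_prereq_dirs` is hypothetical and only checks directories, not their contents:

```shell
# Hypothetical helper: report which prerequisite build directories are missing
# under a runtime checkout. Returns non-zero if any directory is absent.
check_prereq_dirs() {
  local run_dir="$1"
  local run_os="$2"
  local run_arch="$3"
  local missing=0
  local dir
  for dir in \
    "$run_dir/artifacts/bin/coreclr/$run_os.$run_arch.Release" \
    "$run_dir/artifacts/bin/mono/$run_os.$run_arch.Release"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage from the runtime folder (the dotnet SDK check is a separate, optional step):
# check_prereq_dirs "$(pwd)" linux x64 && command -v dotnet >/dev/null
```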

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Numerics.Tests.Perf_BitOperations*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Numerics.Tests.Perf_BitOperations*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
& "$RunDir\src\tests\build.cmd" release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Numerics.Tests.Perf_BitOperations*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Numerics.Tests.Perf_BitOperations*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

System.Numerics.Tests.Perf_BitOperations.LeadingZeroCount_uint
System.Numerics.Tests.Perf_BitOperations.TrailingZeroCount_ulong
System.Numerics.Tests.Perf_BitOperations.LeadingZeroCount_ulong

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in System.Tests.Perf_Double

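| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 222.68 ns | 257.90 ns | 1.16 | 0.02 | False | | | |
| | 140.90 ns | 171.83 ns | 1.22 | 0.03 | False | | | |
| | 138.38 ns | 169.57 ns | 1.23 | 0.03 | True | | | |
| | 281.70 ns | 319.75 ns | 1.14 | 0.02 | False | | | |
| | 293.86 ns | 324.00 ns | 1.10 | 0.03 | False | | | |
| | 140.93 ns | 172.84 ns | 1.23 | 0.02 | True | | | |
| | 204.85 ns | 242.46 ns | 1.18 | 0.02 | True | | | |
| | 283.86 ns | 329.50 ns | 1.16 | 0.04 | False | | | |
| | 284.34 ns | 319.48 ns | 1.12 | 0.02 | False | | | |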

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.(sh/cmd) or, if the system setup is the same, downloaded from the payload above; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Tests.Perf_Double*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Tests.Perf_Double*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
& "$RunDir\src\tests\build.cmd" release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Tests.Perf_Double*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Tests.Perf_Double*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

System.Tests.Perf_Double.ToStringWithFormat(value: -1.7976931348623157E+308, format: "E")
System.Tests.Perf_Double.ToStringWithFormat(value: 12345, format: "R")
System.Tests.Perf_Double.ToStringWithCultureInfo(value: 12345, culture: zh)
System.Tests.Perf_Double.ToStringWithFormat(value: 1.7976931348623157E+308, format: "G")
System.Tests.Perf_Double.ToStringWithCultureInfo(value: -1.7976931348623157E+308, culture: zh)
System.Tests.Perf_Double.ToString(value: 12345)
System.Tests.Perf_Double.ToStringWithFormat(value: 12345, format: "E")
System.Tests.Perf_Double.ToString(value: 1.7976931348623157E+308)
System.Tests.Perf_Double.ToStringWithFormat(value: 1.7976931348623157E+308, format: "R")



Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in System.Tests.Perf_Single

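| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 252.50 ns | 288.20 ns | 1.14 | 0.02 | False | | | |
| | 192.62 ns | 236.71 ns | 1.23 | 0.03 | False | | | |
| | 206.54 ns | 248.95 ns | 1.21 | 0.04 | False | | | |
| | 201.52 ns | 247.55 ns | 1.23 | 0.04 | True | | | |
| | 142.06 ns | 179.25 ns | 1.26 | 0.03 | True | | | |
| | 231.39 ns | 266.11 ns | 1.15 | 0.07 | True | | | |
| | 148.60 ns | 180.38 ns | 1.21 | 0.03 | True | | | |
| | 141.18 ns | 170.30 ns | 1.21 | 0.05 | True | | | |
| | 199.01 ns | 239.20 ns | 1.20 | 0.04 | True | | | |
| | 208.50 ns | 250.74 ns | 1.20 | 0.02 | False | | | |
| | 139.45 ns | 172.92 ns | 1.24 | 0.03 | True | | | |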

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.(sh/cmd) or, if the system setup is the same, downloaded from the payload above; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Tests.Perf_Single*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Tests.Perf_Single*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
& "$RunDir\src\tests\build.cmd" release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Tests.Perf_Single*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Tests.Perf_Single*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

System.Tests.Perf_Single.ToStringWithFormat(value: 3.4028235E+38, format: "G17")
System.Tests.Perf_Single.ToStringWithFormat(value: 3.4028235E+38, format: "E")
System.Tests.Perf_Single.ToString(value: -3.4028235E+38)
System.Tests.Perf_Single.ToStringWithFormat(value: 3.4028235E+38, format: "R")
System.Tests.Perf_Single.ToStringWithFormat(value: 12345, format: "R")
System.Tests.Perf_Single.ToStringWithFormat(value: 12345, format: "G17")
System.Tests.Perf_Single.ToStringWithFormat(value: 12345, format: "G")
System.Tests.Perf_Single.ToStringWithCultureInfo(value: 12345, culture: zh)
System.Tests.Perf_Single.ToString(value: 3.4028235E+38)
System.Tests.Perf_Single.ToStringWithFormat(value: -3.4028235E+38, format: "G")
System.Tests.Perf_Single.ToString(value: 12345)


performanceautofiler bot added the labels arch-x64, os-linux, runtime-mono, untriaged on Jun 11, 2024
Author

Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in Benchstone.BenchI.Ackermann

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 4.05 μs | 5.07 μs | 1.25 | 0.03 | False | | | |

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.(sh/cmd) or, if the system setup is the same, downloaded from the payload above; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'Benchstone.BenchI.Ackermann*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'Benchstone.BenchI.Ackermann*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
& "$RunDir\src\tests\build.cmd" release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'Benchstone.BenchI.Ackermann*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'Benchstone.BenchI.Ackermann*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

Benchstone.BenchI.Ackermann.Test



Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in Span.Sorting

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 255.11 μs | 273.06 μs | 1.07 | 0.03 | False | | | |

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.(sh/cmd) or, if the system setup is the same, downloaded from the payload above; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'Span.Sorting*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'Span.Sorting*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
$RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'Span.Sorting*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'Span.Sorting*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

Span.Sorting.BubbleSortSpan(Size: 512)



Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in Benchstone.BenchI.AddArray2

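| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Benchstone.BenchI.AddArray2.Test | 10.66 ms | 13.95 ms | 1.31 | 0.15 | False | | | |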

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.sh/build.cmd or downloaded from the payload above if the system setup is the same; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'Benchstone.BenchI.AddArray2*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'Benchstone.BenchI.AddArray2*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
$RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'Benchstone.BenchI.AddArray2*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'Benchstone.BenchI.AddArray2*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

Benchstone.BenchI.AddArray2.Test



Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in PerfLabTests.CastingPerf

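| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 43.82 μs | 87.45 μs | 2.00 | 0.02 | False | | | |
| | 88.35 μs | 132.18 μs | 1.50 | 0.02 | False | | | |
| | 147.02 μs | 180.74 μs | 1.23 | 0.05 | False | | | |
| | 43.77 μs | 87.36 μs | 2.00 | 0.02 | False | | | |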
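
The Test/Base column is consistent with the compare-build time divided by the baseline time, so values above 1.00 indicate a slowdown. The following is a minimal sketch that recomputes it for the four CastingPerf rows above, assuming both times in a pair use the same unit. The `ratio` helper is hypothetical and is not part of the autofiler tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: Test/Base = compare time / baseline time,
# rounded to two decimals. Both inputs must use the same unit.
ratio() {
  LC_ALL=C awk -v b="$1" -v t="$2" 'BEGIN { printf "%.2f\n", t / b }'
}

# Baseline and compare times (in μs) from the CastingPerf rows above.
ratio 43.82 87.45
ratio 88.35 132.18
ratio 147.02 180.74
ratio 43.77 87.36
```

The printed values should match the Test/Base column (2.00, 1.50, 1.23, 2.00).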

graph
graph
graph
graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.sh/build.cmd or downloaded from the payload above if the system setup is the same; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'PerfLabTests.CastingPerf*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'PerfLabTests.CastingPerf*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
$RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'PerfLabTests.CastingPerf*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'PerfLabTests.CastingPerf*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

PerfLabTests.CastingPerf.CheckObjIsInterfaceNo


PerfLabTests.CastingPerf.CheckArrayIsInterfaceNo


PerfLabTests.CastingPerf.FooObjIsDescendant


PerfLabTests.CastingPerf.CheckIsInstAnyIsInterfaceNo



Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in System.Collections.IterateForEachNonGeneric<Int32>

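| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| System.Collections.IterateForEachNonGeneric<Int32>.Stack(Size: 512) | 1.86 μs | 2.12 μs | 1.14 | 0.08 | False | | | |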

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.sh/build.cmd or downloaded from the payload above if the system setup is the same; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Collections.IterateForEachNonGeneric<Int32>*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Collections.IterateForEachNonGeneric<Int32>*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
$RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Collections.IterateForEachNonGeneric<Int32>*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Collections.IterateForEachNonGeneric<Int32>*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

System.Collections.IterateForEachNonGeneric<Int32>.Stack(Size: 512)



Run Information

Name Value
Architecture x64
OS ubuntu 22.04
Queue TigerUbuntu
Baseline 59e8bbcf83b664c3de6cfa553d9bbfad76578765
Compare 9d02188cdd26d4dfc26e3f9d4e843c6ae78c1b1c
Diff Diff
Configs CompilationMode:tiered, LLVM:true, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in System.Globalization.Tests.Perf_NumberCultureInfo

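| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 157.23 ns | 197.97 ns | 1.26 | 0.05 | False | | | |
| | 160.72 ns | 196.99 ns | 1.23 | 0.03 | True | | | |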

graph
graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

Prerequisites (files either built locally with build.sh/build.cmd or downloaded from the payload above if the system setup is the same; in this order):

  • Libraries build extracted to runtime/artifacts or build instructions: Libraries README args: -subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0
  • CoreCLR product build extracted to runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release, build instructions: CoreCLR README args: -subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0
  • AOT MONO build extracted to runtime/artifacts/bin/mono/$RunOS.$RunArch.Release, build instructions: MONO README args: -arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false
  • Dotnet SDK installed for dotnet commands
  • Running commands from the runtime folder

Linux

# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory 
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Globalization.Tests.Perf_NumberCultureInfo*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Globalization.Tests.Perf_NumberCultureInfo*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200
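
The Linux repro blocks in this issue differ only in the `--filter` value, so the one-line run can be wrapped in a loop. The following is a rough sketch under that assumption: `run_filter` and the `DRY_RUN` variable are hypothetical names introduced here, while the flags and paths are copied from the one-line run above. By default it only prints the commands so they can be reviewed before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch: build the one-line benchmarks_ci.py invocation for each filter.
# DRY_RUN (hypothetical) defaults to 1, so commands are printed, not run.
RunDir=${RunDir:-$(pwd)}
RunArch=${RunArch:-x64}

# Filters taken from the Linux repro blocks in this issue.
filters=(
  'Span.Sorting*'
  'Benchstone.BenchI.AddArray2*'
  'PerfLabTests.CastingPerf*'
  'System.Collections.IterateForEachNonGeneric<Int32>*'
  'System.Globalization.Tests.Perf_NumberCultureInfo*'
)

# BenchmarkDotNet arguments, identical for every filter.
bdn_args="--anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

run_filter() {
  local cmd=(
    python3 "$RunDir/performance/scripts/benchmarks_ci.py"
    --csproj "$RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj"
    --incremental no --architecture "$RunArch" -f net8.0
    --filter "$1"
    --bdn-artifacts "$RunDir/artifacts/BenchmarkDotNet.Artifacts"
    "--bdn-arguments=$bdn_args"
  )
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Print only; quoting is flattened for display purposes.
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

for filter in "${filters[@]}"; do
  run_filter "$filter"
done
```

Keeping the command in a bash array avoids glob expansion of `*` and stops `<Int32>` from being parsed as a redirection. Setting `DRY_RUN=0` would execute the commands, which still requires the prerequisites and the aot directory setup described above.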

Windows

# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
$RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance 
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Globalization.Tests.Perf_NumberCultureInfo*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime  --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore 
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Globalization.Tests.Perf_NumberCultureInfo*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200

System.Globalization.Tests.Perf_NumberCultureInfo.ToString(culturestring: da)


System.Globalization.Tests.Perf_NumberCultureInfo.ToString(culturestring: fr)


@matouskozak
Member

matouskozak commented Jun 14, 2024

Could the regressions in Perf_Single, Perf_Double, and Perf_NumberCultureInfo be related to #102683, @huoyaoyuan? It looks like there was a discussion on the PR about possible perf implications for Mono. Is the regression expected, @fanyang-mono?

Arm64 Mono AOT-llvm regressions: dotnet/perf-autofiling-issues#36103

@huoyaoyuan
Member

The original PR included benchmarks on Mono and did not show such regressions for other formats and values: #102683 (comment)

I'll run with the exact values from performance repo again.

@matouskozak
Member

matouskozak commented Jun 14, 2024

> The original PR has benchmark on mono and didn't see such regressions on other formats and values: dotnet/runtime#102683 (comment)
>
> I'll run with the exact values from performance repo again.

Thank you. You can try using the https://github.com/dotnet/performance/blob/main/scripts/benchmarks_local.py script to run the selected microbenchmarks on different commits (note: it currently doesn't support a diff between AOT runs, so the diff has to be calculated manually). There is a README with instructions on how to use it. cc: @LoopedBard3
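
Since the diff between AOT runs has to be calculated by hand, the following is a minimal sketch of one way to do it. It assumes the mean times of each run have already been extracted into plain text files with one `<benchmark> <mean_ns>` pair per line. That file format and the `diff_runs` helper are assumptions for illustration, not something benchmarks_local.py produces. The sample values are the Span.Sorting and AddArray2 rows from the autofiled tables earlier in this issue, converted to nanoseconds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given two files of "<benchmark> <mean_ns>" lines,
# print after/before for every benchmark that appears in both files.
diff_runs() {
  LC_ALL=C awk '
    NR == FNR { before[$1] = $2; next }   # first file: baseline means
    ($1 in before) && before[$1] > 0 {    # second file: compare means
      printf "%s %.2f\n", $1, $2 / before[$1]
    }
  ' "$1" "$2"
}

before_file=$(mktemp)
after_file=$(mktemp)

# 255.11 us -> 273.06 us and 10.66 ms -> 13.95 ms, expressed in ns.
printf '%s\n' 'Span.Sorting.BubbleSortSpan 255110' \
              'Benchstone.BenchI.AddArray2.Test 10660000' > "$before_file"
printf '%s\n' 'Span.Sorting.BubbleSortSpan 273060' \
              'Benchstone.BenchI.AddArray2.Test 13950000' > "$after_file"

diff_runs "$before_file" "$after_file"
rm -f "$before_file" "$after_file"
```

A ratio above 1.00 means the second run was slower. Benchmarks missing from the baseline file are skipped rather than reported.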

@huoyaoyuan
Member

huoyaoyuan commented Jun 15, 2024

I had some trouble building with benchmarks_local.py, so I just manually compared the commits a4407a106f883a6593105a660d44da6f8fa4017c and 555dde482e5e76bc0563757b53baee9f0d859a4b.

Results for a4407a106f883a6593105a660d44da6f8fa4017c (After merge):

| Method | Job | OutlierMode | MemoryRandomization | value | format | culture | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|--- |--- |--- |--- |--- |--- |--- |---:|---:|---:|---:|---:|---:|---:|---:|
| Parse | Job-ALNGBP | DontRemove | True | -1.7976931348623157e+308 | ? | ? | 126.47 ns | 0.676 ns | 0.633 ns | 126.45 ns | 125.33 ns | 128.00 ns | - | - |
| TryParse | Job-ALNGBP | DontRemove | True | -1.7976931348623157e+308 | ? | ? | 124.03 ns | 0.651 ns | 0.609 ns | 123.75 ns | 123.03 ns | 125.03 ns | - | - |
| Parse | Job-ALNGBP | DontRemove | True | 1.7976931348623157e+308 | ? | ? | 129.35 ns | 10.418 ns | 11.997 ns | 126.99 ns | 123.90 ns | 180.09 ns | - | - |
| TryParse | Job-ALNGBP | DontRemove | True | 1.7976931348623157e+308 | ? | ? | 123.77 ns | 1.088 ns | 1.018 ns | 123.76 ns | 121.67 ns | 125.45 ns | - | - |
| Parse | Job-ALNGBP | DontRemove | True | 12345 | ? | ? | 77.39 ns | 0.474 ns | 0.443 ns | 77.36 ns | 76.61 ns | 78.22 ns | - | - |
| TryParse | Job-ALNGBP | DontRemove | True | 12345 | ? | ? | 77.42 ns | 0.482 ns | 0.451 ns | 77.59 ns | 76.63 ns | 78.11 ns | - | - |
| IsNaN | Job-LOLVAE | Default | Default | NaN | ? | ? | 1,576,629.60 ns | 16,269.034 ns | 15,218.064 ns | 1,570,184.69 ns | 1,560,424.06 ns | 1,611,865.31 ns | - | 2 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | E | ? | 198.98 ns | 1.220 ns | 1.141 ns | 199.15 ns | 196.00 ns | 200.55 ns | 0.0128 | 56 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | F50 | ? | 32,328.03 ns | 271.606 ns | 254.060 ns | 32,360.89 ns | 31,860.61 ns | 32,813.43 ns | 0.1297 | 744 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | G | ? | 287.88 ns | 3.412 ns | 3.192 ns | 289.45 ns | 281.24 ns | 291.60 ns | 0.0173 | 72 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | G17 | ? | 263.73 ns | 2.204 ns | 2.061 ns | 264.20 ns | 259.10 ns | 266.38 ns | 0.0171 | 72 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | R | ? | 289.88 ns | 3.318 ns | 3.104 ns | 289.08 ns | 285.72 ns | 297.82 ns | 0.0174 | 72 B |
| ToStringWithCultureInfo | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | ? | zh | 284.67 ns | 3.312 ns | 3.098 ns | 283.81 ns | 279.49 ns | 291.88 ns | 0.0173 | 72 B |
| ToString | Job-ALNGBP | DontRemove | True | -1.7976931348623157E+308 | ? | ? | 287.34 ns | 0.974 ns | 0.911 ns | 287.27 ns | 285.58 ns | 289.39 ns | 0.0173 | 72 B |
| IsNaN | Job-LOLVAE | Default | Default | 0 | ? | ? | 1,578,909.75 ns | 5,812.875 ns | 5,437.367 ns | 1,579,611.88 ns | 1,569,534.38 ns | 1,587,322.50 ns | - | 2 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 12345 | E | ? | 187.74 ns | 1.869 ns | 1.748 ns | 187.81 ns | 184.26 ns | 191.71 ns | 0.0112 | 48 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 12345 | F50 | ? | 609.38 ns | 6.156 ns | 5.759 ns | 610.33 ns | 600.17 ns | 617.71 ns | 0.0311 | 136 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 12345 | G | ? | 159.66 ns | 3.085 ns | 3.030 ns | 159.54 ns | 154.74 ns | 166.89 ns | 0.0071 | 32 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 12345 | G17 | ? | 372.65 ns | 4.898 ns | 4.581 ns | 372.69 ns | 365.90 ns | 382.80 ns | 0.0073 | 32 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 12345 | R | ? | 157.86 ns | 2.119 ns | 1.983 ns | 157.81 ns | 153.06 ns | 160.91 ns | 0.0076 | 32 B |
| ToStringWithCultureInfo | Job-ALNGBP | DontRemove | True | 12345 | ? | zh | 155.51 ns | 1.678 ns | 1.570 ns | 155.24 ns | 153.41 ns | 157.69 ns | 0.0075 | 32 B |
| ToString | Job-ALNGBP | DontRemove | True | 12345 | ? | ? | 153.96 ns | 2.055 ns | 1.922 ns | 153.85 ns | 150.05 ns | 157.32 ns | 0.0074 | 32 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | E | ? | 191.94 ns | 3.031 ns | 2.835 ns | 192.16 ns | 185.47 ns | 197.49 ns | 0.0113 | 48 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | F50 | ? | 32,225.35 ns | 381.669 ns | 357.014 ns | 32,125.21 ns | 31,734.57 ns | 33,000.68 ns | 0.1278 | 744 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | G | ? | 286.48 ns | 3.008 ns | 2.814 ns | 287.07 ns | 280.52 ns | 290.44 ns | 0.0174 | 72 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | G17 | ? | 265.71 ns | 3.678 ns | 3.440 ns | 265.49 ns | 259.59 ns | 275.35 ns | 0.0170 | 72 B |
| ToStringWithFormat | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | R | ? | 295.14 ns | 3.140 ns | 2.937 ns | 294.29 ns | 291.84 ns | 301.26 ns | 0.0174 | 72 B |
| ToStringWithCultureInfo | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | ? | zh | 287.96 ns | 5.406 ns | 5.057 ns | 286.22 ns | 280.15 ns | 296.43 ns | 0.0170 | 72 B |
| ToString | Job-ALNGBP | DontRemove | True | 1.7976931348623157E+308 | ? | ? | 287.89 ns | 3.275 ns | 3.064 ns | 286.80 ns | 283.82 ns | 293.56 ns | 0.0163 | 72 B |

Results for 555dde482e5e76bc0563757b53baee9f0d859a4b (Before merge):

| Method | Job | OutlierMode | MemoryRandomization | value | format | culture | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|--- |--- |--- |--- |--- |--- |--- |---:|---:|---:|---:|---:|---:|---:|---:|
| Parse | Job-TTTZYP | DontRemove | True | -1.7976931348623157e+308 | ? | ? | 126.93 ns | 1.087 ns | 1.017 ns | 126.58 ns | 125.10 ns | 128.33 ns | - | - |
| TryParse | Job-TTTZYP | DontRemove | True | -1.7976931348623157e+308 | ? | ? | 126.47 ns | 0.557 ns | 0.521 ns | 126.60 ns | 125.53 ns | 127.50 ns | - | - |
| Parse | Job-TTTZYP | DontRemove | True | 1.7976931348623157e+308 | ? | ? | 127.95 ns | 0.768 ns | 0.718 ns | 127.93 ns | 126.46 ns | 129.28 ns | - | - |
| TryParse | Job-TTTZYP | DontRemove | True | 1.7976931348623157e+308 | ? | ? | 122.27 ns | 1.852 ns | 1.733 ns | 123.14 ns | 118.89 ns | 124.19 ns | - | - |
| Parse | Job-TTTZYP | DontRemove | True | 12345 | ? | ? | 76.50 ns | 1.453 ns | 1.359 ns | 76.11 ns | 75.12 ns | 80.45 ns | - | - |
| TryParse | Job-TTTZYP | DontRemove | True | 12345 | ? | ? | 74.88 ns | 1.029 ns | 0.963 ns | 74.73 ns | 73.41 ns | 76.58 ns | - | - |
| IsNaN | Job-TEQVAO | Default | Default | NaN | ? | ? | 1,566,314.42 ns | 3,930.789 ns | 3,484.542 ns | 1,565,279.06 ns | 1,561,193.75 ns | 1,573,641.88 ns | - | 2 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | E | ? | 198.04 ns | 2.621 ns | 2.451 ns | 197.74 ns | 195.12 ns | 204.71 ns | 0.0127 | 56 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | F50 | ? | 34,178.47 ns | 735.195 ns | 846.652 ns | 34,508.87 ns | 32,447.92 ns | 34,931.41 ns | 0.1299 | 744 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | G | ? | 286.63 ns | 1.626 ns | 1.521 ns | 286.63 ns | 282.45 ns | 289.58 ns | 0.0173 | 72 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | G17 | ? | 263.57 ns | 1.688 ns | 1.579 ns | 263.21 ns | 261.43 ns | 267.93 ns | 0.0168 | 72 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | R | ? | 289.32 ns | 1.933 ns | 1.808 ns | 289.16 ns | 286.58 ns | 292.85 ns | 0.0174 | 72 B |
| ToStringWithCultureInfo | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | ? | zh | 285.50 ns | 1.880 ns | 1.758 ns | 285.51 ns | 281.62 ns | 288.28 ns | 0.0173 | 72 B |
| ToString | Job-TTTZYP | DontRemove | True | -1.7976931348623157E+308 | ? | ? | 285.61 ns | 1.176 ns | 1.100 ns | 285.42 ns | 284.06 ns | 288.04 ns | 0.0171 | 72 B |
| IsNaN | Job-TEQVAO | Default | Default | 0 | ? | ? | 1,590,207.21 ns | 5,467.024 ns | 5,113.858 ns | 1,590,451.25 ns | 1,579,600.00 ns | 1,597,880.00 ns | - | 2 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 12345 | E | ? | 184.10 ns | 2.447 ns | 2.289 ns | 184.45 ns | 180.76 ns | 188.04 ns | 0.0112 | 48 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 12345 | F50 | ? | 583.29 ns | 5.609 ns | 5.247 ns | 581.82 ns | 574.89 ns | 593.43 ns | 0.0322 | 136 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 12345 | G | ? | 155.27 ns | 1.181 ns | 1.105 ns | 154.77 ns | 154.29 ns | 157.71 ns | 0.0075 | 32 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 12345 | G17 | ? | 360.51 ns | 1.653 ns | 1.546 ns | 361.16 ns | 356.68 ns | 363.27 ns | 0.0073 | 32 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 12345 | R | ? | 156.85 ns | 0.792 ns | 0.741 ns | 157.00 ns | 155.59 ns | 158.23 ns | 0.0076 | 32 B |
| ToStringWithCultureInfo | Job-TTTZYP | DontRemove | True | 12345 | ? | zh | 151.58 ns | 0.445 ns | 0.416 ns | 151.37 ns | 150.89 ns | 152.27 ns | 0.0073 | 32 B |
| ToString | Job-TTTZYP | DontRemove | True | 12345 | ? | ? | 151.66 ns | 0.533 ns | 0.498 ns | 151.73 ns | 150.62 ns | 152.36 ns | 0.0073 | 32 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | E | ? | 186.51 ns | 0.921 ns | 0.861 ns | 186.22 ns | 185.29 ns | 188.16 ns | 0.0113 | 48 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | F50 | ? | 32,306.26 ns | 167.847 ns | 157.004 ns | 32,318.58 ns | 32,046.27 ns | 32,515.91 ns | 0.1281 | 744 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | G | ? | 283.88 ns | 4.037 ns | 3.776 ns | 284.95 ns | 276.21 ns | 288.82 ns | 0.0170 | 72 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | G17 | ? | 257.50 ns | 2.654 ns | 2.482 ns | 258.13 ns | 251.57 ns | 260.13 ns | 0.0174 | 72 B |
| ToStringWithFormat | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | R | ? | 284.75 ns | 5.005 ns | 4.682 ns | 284.70 ns | 275.10 ns | 292.99 ns | 0.0170 | 72 B |
| ToStringWithCultureInfo | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | ? | zh | 292.01 ns | 3.279 ns | 3.067 ns | 292.11 ns | 287.17 ns | 297.15 ns | 0.0174 | 72 B |
| ToString | Job-TTTZYP | DontRemove | True | 1.7976931348623157E+308 | ? | ? | 286.36 ns | 5.454 ns | 5.102 ns | 285.15 ns | 280.24 ns | 294.59 ns | 0.0172 | 72 B |

The difference is generally less than 5%, so it doesn't seem to be caused by #102683
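
For reference, the "less than 5%" reading can be checked with a few lines of Python. This is only a rough sketch: the mean times are copied by hand from a handful of rows in the two result tables above, and the helper name `relative_change_percent` and the choice of rows are illustrative assumptions rather than part of any benchmark tooling.

```python
# Mean times in ns, copied by hand from the two result tables above.
# Each entry is (after merge, before merge). The selection of rows is
# an illustrative sample, not the full table.
means_ns = {
    "Parse(12345)": (77.39, 76.50),
    "ToString(12345)": (153.96, 151.66),
    "ToStringWithFormat(12345, G17)": (372.65, 360.51),
    "ToStringWithFormat(12345, F50)": (609.38, 583.29),
    "ToStringWithFormat(1.7976931348623157E+308, G17)": (265.71, 257.50),
}


def relative_change_percent(after: float, before: float) -> float:
    """Signed change of `after` relative to `before`, in percent."""
    return (after - before) / before * 100.0


changes = {
    name: relative_change_percent(after, before)
    for name, (after, before) in means_ns.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

A positive value means the run after the merge was slower for that row.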

@matouskozak
Member

matouskozak commented Jun 17, 2024

Could you please share what issues you encountered?

I would use something like: python3 benchmarks_local.py --commits a4407a106f883a6593105a660d44da6f8fa4017c 555dde482e5e76bc0563757b53baee9f0d859a4b --run-types MonoAOTLLVM --filter "*Perf_Double.ToString*" to run the Perf_Double.ToString* microbenchmarks.

@huoyaoyuan
Member

I'm using a similar command line with --allow-non-admin-execution. It fails at the pack step with the following errors:

[INFO] C:\Users\Meow\.nuget\packages\microsoft.dotnet.sharedframework.sdk\9.0.0-beta.24281.1\targets\sharedfx.targets(296,5): error : The following files are missing entries in the templated manifest: [C:\Users\Meow\runtime\runtime\src\installer\pkg\sfx\Microsoft.NETCore.App\Microsoft.NETCore.App.Runtime.sfxproj]
[INFO] C:\Users\Meow\.nuget\packages\microsoft.dotnet.sharedframework.sdk\9.0.0-beta.24281.1\targets\sharedfx.targets(296,5): error : llc.exe [C:\Users\Meow\runtime\runtime\src\installer\pkg\sfx\Microsoft.NETCore.App\Microsoft.NETCore.App.Runtime.sfxproj]
[INFO] C:\Users\Meow\.nuget\packages\microsoft.dotnet.sharedframework.sdk\9.0.0-beta.24281.1\targets\sharedfx.targets(296,5): error : opt.exe. Add these file names with extensions to the 'PlatformManifestFileEntry' item group for the runtime pack and corresponding ref pack to include them in the platform manifest. [C:\Users\Meow\runtime\runtime\src\installer\pkg\sfx\Microsoft.NETCore.App\Microsoft.NETCore.App.Runtime.sfxproj]

Can you confirm that the regression is unrelated to #102683?

@matouskozak
Member

matouskozak commented Jun 18, 2024

Thank you for the error log. It looks similar to #103486. Do you think it could be related, @LoopedBard3?

I've done a quick run with sudo and got:

| Method                  | value                    | format | culture | Mean        | Error     | StdDev    | Median      | Min         | Max         | Gen0   | Allocated |
|------------------------ |------------------------- |------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|-------:|----------:|
| ToStringWithFormat      | -1.7976931348623157E+308 | E      | ?       |    230.2 ns |   2.93 ns |   2.74 ns |    229.3 ns |    227.6 ns |    237.0 ns | 0.0129 |      56 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | F50    | ?       | 24,636.6 ns | 206.05 ns | 192.74 ns | 24,556.4 ns | 24,446.0 ns | 25,197.7 ns | 0.1956 |     744 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G      | ?       |    312.7 ns |   4.43 ns |   4.14 ns |    311.3 ns |    308.5 ns |    324.5 ns | 0.0172 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G17    | ?       |    292.8 ns |   2.50 ns |   2.34 ns |    292.2 ns |    290.0 ns |    299.0 ns | 0.0164 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | R      | ?       |    306.8 ns |   3.82 ns |   3.57 ns |    306.0 ns |    303.7 ns |    316.4 ns | 0.0163 |      72 B |
| ToStringWithCultureInfo | -1.7976931348623157E+308 | ?      | zh      |    307.5 ns |   3.37 ns |   3.15 ns |    306.8 ns |    303.3 ns |    316.0 ns | 0.0174 |      72 B |
| ToString                | -1.7976931348623157E+308 | ?      | ?       |    305.0 ns |   2.14 ns |   2.00 ns |    304.4 ns |    302.4 ns |    310.7 ns | 0.0174 |      72 B |
| ToStringWithFormat      | 12345                    | E      | ?       |    218.9 ns |   1.52 ns |   1.42 ns |    218.4 ns |    217.6 ns |    223.4 ns | 0.0108 |      48 B |
| ToStringWithFormat      | 12345                    | F50    | ?       |    597.6 ns |   7.77 ns |   7.27 ns |    595.3 ns |    588.4 ns |    615.1 ns | 0.0304 |     136 B |
| ToStringWithFormat      | 12345                    | G      | ?       |    140.9 ns |   1.93 ns |   1.80 ns |    140.3 ns |    139.3 ns |    145.8 ns | 0.0075 |      32 B |
| ToStringWithFormat      | 12345                    | G17    | ?       |    257.2 ns |   3.50 ns |   3.27 ns |    256.3 ns |    254.9 ns |    268.2 ns | 0.0075 |      32 B |
| ToStringWithFormat      | 12345                    | R      | ?       |    142.3 ns |   1.78 ns |   1.67 ns |    141.8 ns |    140.5 ns |    145.9 ns | 0.0075 |      32 B |
| ToStringWithCultureInfo | 12345                    | ?      | zh      |    140.9 ns |   1.85 ns |   1.73 ns |    140.6 ns |    138.9 ns |    146.2 ns | 0.0075 |      32 B |
| ToString                | 12345                    | ?      | ?       |    139.1 ns |   1.33 ns |   1.24 ns |    138.6 ns |    138.1 ns |    142.5 ns | 0.0075 |      32 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | E      | ?       |    207.4 ns |   3.64 ns |   3.41 ns |    206.1 ns |    205.0 ns |    217.6 ns | 0.0107 |      48 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | F50    | ?       | 25,054.9 ns | 842.28 ns | 969.97 ns | 24,493.2 ns | 24,407.7 ns | 28,158.3 ns | 0.1956 |     744 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G      | ?       |    298.3 ns |   4.52 ns |   4.23 ns |    296.5 ns |    294.8 ns |    309.4 ns | 0.0172 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G17    | ?       |    281.6 ns |   3.36 ns |   3.14 ns |    281.1 ns |    278.6 ns |    290.7 ns | 0.0166 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | R      | ?       |    298.0 ns |   4.09 ns |   3.83 ns |    296.9 ns |    294.2 ns |    308.4 ns | 0.0170 |      72 B |
| ToStringWithCultureInfo | 1.7976931348623157E+308  | ?      | zh      |    307.2 ns |   2.83 ns |   2.65 ns |    306.3 ns |    303.8 ns |    312.6 ns | 0.0166 |      72 B |
| ToString                | 1.7976931348623157E+308  | ?      | ?       |    300.5 ns |   5.17 ns |   4.84 ns |    298.9 ns |    296.4 ns |    312.5 ns | 0.0166 |      72 B |

and

| Method                  | value                    | format | culture | Mean        | Error     | StdDev    | Median      | Min         | Max         | Gen0   | Allocated |
|------------------------ |------------------------- |------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|-------:|----------:|
| ToStringWithFormat      | -1.7976931348623157E+308 | E      | ?       |    285.6 ns |   5.49 ns |   5.13 ns |    284.4 ns |    279.3 ns |    296.0 ns | 0.0127 |      56 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | F50    | ?       | 24,635.2 ns | 427.95 ns | 400.31 ns | 24,470.8 ns | 24,345.2 ns | 25,663.1 ns | 0.0989 |     744 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G      | ?       |    396.5 ns |   4.04 ns |   3.78 ns |    395.5 ns |    393.4 ns |    408.9 ns | 0.0163 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G17    | ?       |    380.3 ns |   5.20 ns |   4.86 ns |    379.0 ns |    374.1 ns |    392.0 ns | 0.0164 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | R      | ?       |    371.0 ns |   2.26 ns |   2.12 ns |    371.0 ns |    367.8 ns |    375.4 ns | 0.0167 |      72 B |
| ToStringWithCultureInfo | -1.7976931348623157E+308 | ?      | zh      |    369.2 ns |   4.19 ns |   3.92 ns |    367.6 ns |    366.3 ns |    380.9 ns | 0.0174 |      72 B |
| ToString                | -1.7976931348623157E+308 | ?      | ?       |    369.4 ns |   5.73 ns |   5.36 ns |    367.9 ns |    364.4 ns |    385.2 ns | 0.0161 |      72 B |
| ToStringWithFormat      | 12345                    | E      | ?       |    278.2 ns |   2.33 ns |   2.18 ns |    277.4 ns |    275.8 ns |    283.6 ns | 0.0113 |      48 B |
| ToStringWithFormat      | 12345                    | F50    | ?       |    624.6 ns |  12.55 ns |  12.89 ns |    622.5 ns |    608.5 ns |    655.1 ns | 0.0306 |     136 B |
| ToStringWithFormat      | 12345                    | G      | ?       |    200.9 ns |   3.23 ns |   3.02 ns |    200.0 ns |    197.7 ns |    207.7 ns | 0.0073 |      32 B |
| ToStringWithFormat      | 12345                    | G17    | ?       |    295.4 ns |   9.66 ns |  11.12 ns |    291.7 ns |    288.5 ns |    339.3 ns | 0.0067 |      32 B |
| ToStringWithFormat      | 12345                    | R      | ?       |    205.2 ns |   4.30 ns |   4.95 ns |    203.6 ns |    200.4 ns |    217.7 ns | 0.0075 |      32 B |
| ToStringWithCultureInfo | 12345                    | ?      | zh      |    202.5 ns |   4.07 ns |   3.81 ns |    200.8 ns |    198.9 ns |    212.8 ns | 0.0072 |      32 B |
| ToString                | 12345                    | ?      | ?       |    195.6 ns |   1.31 ns |   1.23 ns |    195.2 ns |    194.2 ns |    197.9 ns | 0.0072 |      32 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | E      | ?       |    270.9 ns |   1.95 ns |   1.83 ns |    270.2 ns |    268.5 ns |    274.3 ns | 0.0109 |      48 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | F50    | ?       | 24,587.2 ns | 214.57 ns | 200.71 ns | 24,505.6 ns | 24,411.7 ns | 25,138.2 ns | 0.1950 |     744 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G      | ?       |    354.2 ns |   6.75 ns |   6.31 ns |    352.6 ns |    348.5 ns |    372.8 ns | 0.0165 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G17    | ?       |    342.4 ns |   6.53 ns |   6.42 ns |    340.1 ns |    335.3 ns |    361.4 ns | 0.0164 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | R      | ?       |    370.1 ns |   5.50 ns |   5.14 ns |    368.0 ns |    365.8 ns |    382.3 ns | 0.0174 |      72 B |
| ToStringWithCultureInfo | 1.7976931348623157E+308  | ?      | zh      |    354.9 ns |  11.71 ns |  13.48 ns |    350.1 ns |    342.6 ns |    389.4 ns | 0.0172 |      72 B |
| ToString                | 1.7976931348623157E+308  | ?      | ?       |    364.6 ns |   5.38 ns |   5.03 ns |    362.5 ns |    359.8 ns |    379.1 ns | 0.0164 |      72 B |

Looks like there is a measurable regression between the 2 commits.
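
One rough way to check "measurable" is to see whether the Mean ± Error intervals of the two runs overlap (BenchmarkDotNet reports Error as half of the 99.9% confidence interval). The sketch below does this for a few rows copied by hand from the two tables above. The tables are not labeled with commits here, so they are only called "first" and "second"; the function name and the row selection are illustrative assumptions.

```python
# (mean, error) in ns, copied by hand from the two tables above.
# "first" and "second" refer to the order of the tables; which commit
# each table belongs to is not stated, so no commit labels are assumed.
rows = {
    "ToString(12345)": ((139.1, 1.33), (195.6, 1.31)),
    "ToString(-1.7976931348623157E+308)": ((305.0, 2.14), (369.4, 5.73)),
    "ToStringWithFormat(12345, G)": ((140.9, 1.93), (200.9, 3.23)),
    "ToStringWithFormat(-1.7976931348623157E+308, F50)": (
        (24636.6, 206.05),
        (24635.2, 427.95),
    ),
}


def intervals_overlap(a, b):
    """True if [mean - error, mean + error] of a and b intersect."""
    (mean_a, err_a), (mean_b, err_b) = a, b
    return mean_a - err_a <= mean_b + err_b and mean_b - err_b <= mean_a + err_a


results = {}
for name, (first, second) in rows.items():
    ratio = second[0] / first[0]           # second-table mean / first-table mean
    separated = not intervals_overlap(first, second)
    results[name] = (ratio, separated)
    print(f"{name}: ratio={ratio:.2f}, intervals separated={separated}")
```

Rows whose intervals are clearly separated differ by more than the reported measurement error, while the F50 row for the extreme value is essentially unchanged between the two tables.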

@LoopedBard3
Member

Yup, I think the error looks related to #103486, or at the very least another set of files that need to be added to the list.

@matouskozak matouskozak transferred this issue from dotnet/perf-autofiling-issues Jun 27, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 27, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jun 27, 2024
@matouskozak matouskozak added area-System.Numerics and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 27, 2024
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

@matouskozak
Member

@huoyaoyuan In my last comment I shared the results reproducing the regression. Do you need any more assistance with measurements or something else?

@huoyaoyuan
Member

I was just busy these days. Will try to run it tomorrow.

@huoyaoyuan
Member

I was finally able to run the benchmark script under WSL. On Windows it kept failing across several different attempts.

| Method                  | value                    | format | culture | Mean        | Error     | StdDev    | Median      | Min         | Max         | Gen0   | Allocated |
|------------------------ |------------------------- |------- |-------- |------------:|----------:|----------:|------------:|------------:|------------:|-------:|----------:|
| ToStringWithFormat      | -1.7976931348623157E+308 | E      | ?       |    179.2 ns |   1.23 ns |   1.15 ns |    179.5 ns |    176.7 ns |    181.5 ns | 0.0128 |      56 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | F50    | ?       | 16,429.3 ns | 163.75 ns | 153.18 ns | 16,419.9 ns | 16,100.6 ns | 16,710.7 ns | 0.1941 |     744 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G      | ?       |    268.3 ns |   2.78 ns |   2.60 ns |    267.5 ns |    264.6 ns |    274.8 ns | 0.0173 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G17    | ?       |    252.3 ns |   2.48 ns |   2.32 ns |    252.7 ns |    248.1 ns |    256.9 ns | 0.0172 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | R      | ?       |    267.2 ns |   2.89 ns |   2.70 ns |    267.1 ns |    263.9 ns |    274.6 ns | 0.0174 |      72 B |
| ToStringWithCultureInfo | -1.7976931348623157E+308 | ?      | zh      |    260.4 ns |   3.86 ns |   3.61 ns |    260.6 ns |    254.2 ns |    265.8 ns | 0.0167 |      72 B |
| ToString                | -1.7976931348623157E+308 | ?      | ?       |    265.8 ns |   2.56 ns |   2.40 ns |    266.0 ns |    261.3 ns |    270.9 ns | 0.0171 |      72 B |
| ToStringWithFormat      | 12345                    | E      | ?       |    174.0 ns |   2.91 ns |   2.72 ns |    173.7 ns |    168.8 ns |    179.1 ns | 0.0113 |      48 B |
| ToStringWithFormat      | 12345                    | F50    | ?       |    486.4 ns |   5.71 ns |   5.34 ns |    486.0 ns |    479.0 ns |    497.3 ns | 0.0324 |     136 B |
| ToStringWithFormat      | 12345                    | G      | ?       |    130.1 ns |   1.86 ns |   1.74 ns |    130.7 ns |    126.0 ns |    132.0 ns | 0.0074 |      32 B |
| ToStringWithFormat      | 12345                    | G17    | ?       |    180.1 ns |   1.84 ns |   1.72 ns |    180.0 ns |    177.2 ns |    183.0 ns | 0.0070 |      32 B |
| ToStringWithFormat      | 12345                    | R      | ?       |    125.2 ns |   0.76 ns |   0.71 ns |    125.3 ns |    123.6 ns |    126.5 ns | 0.0076 |      32 B |
| ToStringWithCultureInfo | 12345                    | ?      | zh      |    124.0 ns |   1.28 ns |   1.20 ns |    123.9 ns |    122.8 ns |    127.0 ns | 0.0075 |      32 B |
| ToString                | 12345                    | ?      | ?       |    125.2 ns |   1.00 ns |   0.93 ns |    125.1 ns |    124.0 ns |    127.8 ns | 0.0076 |      32 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | E      | ?       |    171.5 ns |   1.35 ns |   1.26 ns |    171.2 ns |    169.5 ns |    174.6 ns | 0.0110 |      48 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | F50    | ?       | 16,602.1 ns | 136.96 ns | 128.12 ns | 16,572.9 ns | 16,435.3 ns | 16,886.5 ns | 0.1939 |     744 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G      | ?       |    268.0 ns |  16.09 ns |  18.53 ns |    262.6 ns |    258.7 ns |    332.3 ns | 0.0166 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G17    | ?       |    247.9 ns |   2.17 ns |   2.03 ns |    247.4 ns |    245.2 ns |    251.6 ns | 0.0169 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | R      | ?       |    261.4 ns |   3.42 ns |   3.20 ns |    261.2 ns |    257.5 ns |    267.9 ns | 0.0169 |      72 B |
| ToStringWithCultureInfo | 1.7976931348623157E+308  | ?      | zh      |    258.3 ns |   2.36 ns |   2.21 ns |    258.1 ns |    254.4 ns |    263.0 ns | 0.0165 |      72 B |
| ToString                | 1.7976931348623157E+308  | ?      | ?       |    261.1 ns |   1.56 ns |   1.45 ns |    261.1 ns |    257.2 ns |    263.6 ns | 0.0166 |      72 B |

| Method                  | value                    | format | culture | Mean         | Error      | StdDev     | Median       | Min          | Max          | Gen0   | Allocated |
|------------------------ |------------------------- |------- |-------- |-------------:|-----------:|-----------:|-------------:|-------------:|-------------:|-------:|----------:|
| ToStringWithFormat      | -1.7976931348623157E+308 | E      | ?       |    145.27 ns |   1.992 ns |   1.864 ns |    145.16 ns |    142.53 ns |    149.13 ns | 0.0129 |      56 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | F50    | ?       | 16,306.43 ns | 193.665 ns | 181.154 ns | 16,302.42 ns | 16,015.62 ns | 16,591.64 ns | 0.1941 |     744 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G      | ?       |    240.99 ns |   2.381 ns |   2.228 ns |    240.78 ns |    237.48 ns |    246.65 ns | 0.0174 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | G17    | ?       |    228.99 ns |   1.918 ns |   1.794 ns |    228.77 ns |    224.88 ns |    231.88 ns | 0.0173 |      72 B |
| ToStringWithFormat      | -1.7976931348623157E+308 | R      | ?       |    240.31 ns |   1.715 ns |   1.604 ns |    240.50 ns |    238.19 ns |    243.45 ns | 0.0172 |      72 B |
| ToStringWithCultureInfo | -1.7976931348623157E+308 | ?      | zh      |    233.90 ns |   1.246 ns |   1.166 ns |    233.24 ns |    232.64 ns |    235.74 ns | 0.0168 |      72 B |
| ToString                | -1.7976931348623157E+308 | ?      | ?       |    235.46 ns |   1.461 ns |   1.366 ns |    235.62 ns |    233.16 ns |    237.81 ns | 0.0170 |      72 B |
| ToStringWithFormat      | 12345                    | E      | ?       |    138.85 ns |   1.330 ns |   1.245 ns |    138.59 ns |    137.37 ns |    142.61 ns | 0.0112 |      48 B |
| ToStringWithFormat      | 12345                    | F50    | ?       |    498.15 ns |   5.174 ns |   4.839 ns |    497.46 ns |    491.25 ns |    508.22 ns | 0.0325 |     136 B |
| ToStringWithFormat      | 12345                    | G      | ?       |     97.19 ns |   1.147 ns |   1.072 ns |     97.30 ns |     94.87 ns |     99.38 ns | 0.0075 |      32 B |
| ToStringWithFormat      | 12345                    | G17    | ?       |    140.31 ns |   1.938 ns |   1.813 ns |    139.93 ns |    137.99 ns |    144.45 ns | 0.0076 |      32 B |
| ToStringWithFormat      | 12345                    | R      | ?       |     94.63 ns |   1.437 ns |   1.344 ns |     94.68 ns |     92.69 ns |     96.52 ns | 0.0076 |      32 B |
| ToStringWithCultureInfo | 12345                    | ?      | zh      |     94.01 ns |   0.834 ns |   0.780 ns |     93.91 ns |     93.02 ns |     95.11 ns | 0.0075 |      32 B |
| ToString                | 12345                    | ?      | ?       |     92.90 ns |   1.219 ns |   1.140 ns |     93.02 ns |     91.46 ns |     95.43 ns | 0.0076 |      32 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | E      | ?       |    141.11 ns |   1.295 ns |   1.211 ns |    140.91 ns |    139.41 ns |    143.57 ns | 0.0113 |      48 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | F50    | ?       | 16,351.35 ns | 114.370 ns | 106.982 ns | 16,356.80 ns | 16,169.31 ns | 16,532.19 ns | 0.1937 |     744 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G      | ?       |    227.34 ns |   2.033 ns |   1.901 ns |    228.01 ns |    222.82 ns |    229.60 ns | 0.0173 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | G17    | ?       |    217.83 ns |   1.664 ns |   1.556 ns |    217.88 ns |    214.59 ns |    220.59 ns | 0.0168 |      72 B |
| ToStringWithFormat      | 1.7976931348623157E+308  | R      | ?       |    232.44 ns |   2.088 ns |   1.953 ns |    232.74 ns |    227.63 ns |    235.68 ns | 0.0167 |      72 B |
| ToStringWithCultureInfo | 1.7976931348623157E+308  | ?      | zh      |    225.95 ns |   2.621 ns |   2.452 ns |    226.27 ns |    222.08 ns |    229.61 ns | 0.0166 |      72 B |
| ToString                | 1.7976931348623157E+308  | ?      | ?       |    227.49 ns |   2.401 ns |   2.245 ns |    227.78 ns |    222.42 ns |    231.75 ns | 0.0173 |      72 B |

Yes, the regression can be confirmed.
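
To see where the gap is largest, the ratios can be grouped by input value. The sketch below is a minimal illustration under stated assumptions: the means are copied by hand from the ToString, ToStringWithFormat G, and ToStringWithFormat R rows of the two WSL tables above, the tables are referred to only as "first" (slower) and "second" (faster) because they are not labeled with commits, and the geometric mean is just one reasonable way to summarize ratios.

```python
from statistics import geometric_mean

# (first-table mean, second-table mean) in ns for the ToString,
# ToStringWithFormat G and ToStringWithFormat R rows, grouped by value.
# Values are copied by hand from the two WSL tables above.
means_by_value = {
    "12345": [(125.2, 92.90), (130.1, 97.19), (125.2, 94.63)],
    "-1.7976931348623157E+308": [(265.8, 235.46), (268.3, 240.99), (267.2, 240.31)],
    "1.7976931348623157E+308": [(261.1, 227.49), (268.0, 227.34), (261.4, 232.44)],
}

# Geometric mean of first/second ratios per value: > 1 means the first
# table is slower than the second for that input.
slowdown = {
    value: geometric_mean([first / second for first, second in pairs])
    for value, pairs in means_by_value.items()
}

for value, ratio in slowdown.items():
    print(f"value={value}: first/second = {ratio:.3f}")
```

In this sample the ratio is noticeably larger for the short value 12345 than for the two extreme values.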

@huoyaoyuan
Member

Analysis: there are many leaf calls to TNumber.IsXXX and similar methods. If Mono fails to inline and devirtualize all of them, that can be a source of regression, especially for regular (small) precision.

@matouskozak
Member

Thank you for investigating. It's definitely possible, but I cannot tell with certainty off the top of my head whether they get inlined/devirtualized.

Another idea I got when looking at your PR was the change from using TryRunDouble/TryRunHalf to the generic TryRun<TNumber>. In many scenarios, the Mono LLVM codegen falls back to the Mono mini codegen (much less optimized) when it encounters generics, which results in worse generated code. (fyi: @steveisok)

@huoyaoyuan
Member

It's close to feature complete for .NET 9. Should we accept the regression, or revert to the separate overloads for float and double? The generic version can be used for less important types, including Half and BFloat16.

/cc @fanyang-mono

@fanyang-mono
Member

@huoyaoyuan Before answering your question, I would like to understand the purpose of your PR. Was the goal to simplify the library code or to enhance CoreCLR performance? If it was to simplify the library code, I would say revert the change, as a ~20% regression is quite significant.

Another thing that I don't understand is that on your PR, you did run the microbenchmarks for Mono AOT and didn't see a significant change there. But here you were able to see the regression when you ran them again. I wonder what you did differently?

@huoyaoyuan
Member

Was the goal to simplify the library code or to enhance CoreCLR performance?

It was simplification. I did not want to duplicate all the methods yet again when adding BFloat16. I think it should be fine to use the generic version for Half and BFloat16 since they are less important.

Another thing that I don't understand is that on your PR, you did run the microbenchmarks for Mono AOT and didn't see a significant change there. But here you were able to see the regression when you ran them again. I wonder what you did differently?

The manual test was on Windows, and the script-driven test was on WSL. Maybe the build on Windows was not clean and got messed up.

@tannergooding
Member

tannergooding commented Jul 17, 2024

I think in this case there's a small enough amount of code and this is a core enough API that duplicating it is "ok".

But notably this is just another scenario where a fairly substantial amount of code (around 200 lines) needs to exist per supported T, and the number of T we have to support is frequently 2 (byte and char), 3 (float, double, Half; potentially BFloat16 and others in the future), 10 (byte/sbyte, short/ushort, int/uint, long/ulong, nint/nuint), or even the full set of primitives. Generics are the solution, not only to avoid this duplication but also to increase maintainability and reduce the risk of bugs. There are many other scenarios where duplicating the code is not feasible or adds too much risk, so duplication is not always going to be a viable workaround to ensure that Mono performance doesn't regress.
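
To make the trade-off concrete, here is a minimal sketch of the two styles. The class and method names are hypothetical and are not taken from the PR; the sketch assumes .NET 7 or later (C# 11) for the generic math interfaces.

```csharp
using System;
using System.Numerics;

public static class DuplicationSketch
{
    // Non-generic style: the same body is written once per supported type.
    public static bool IsNegativeZero(float value) => value == 0f && float.IsNegative(value);
    public static bool IsNegativeZero(double value) => value == 0d && double.IsNegative(value);

    // Generic style: a single body covers float, double, Half, and any future
    // type that implements IFloatingPointIeee754<T>.
    public static bool IsNegativeZeroGeneric<T>(T value) where T : IFloatingPointIeee754<T>
        => value == T.Zero && T.IsNegative(value);
}
```

The generic version removes the per-type copies, but how well it performs depends on whether the runtime generates a specialized body for each T, which is the point under discussion here.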

It would be great if we could take another look at what it would take to add even basic generic specialization support to Mono and try to prioritize that work, as I expect it would bring substantial improvements to the ecosystem and likely give Mono some reasonable size improvements on top of that. Even if the feature was scoped to only work for corelib using an internal [MonoSpecialization(typeof(...))] attribute, so we could explicitly list the finite set of encounterable T, I think that would provide substantial wins and solve the more immediate problem.

CC. @stephentoub, @jeffhandley as an FYI

@fanyang-mono
Member

fanyang-mono commented Jul 17, 2024

@huoyaoyuan It would be nice to get the revert PR merged before the Preview 7 snap (July 19), if possible.

@huoyaoyuan
Member

I'll try to get the code ready tomorrow (July 18 morning US time).

@jkotas
Member

jkotas commented Jul 19, 2024

I would say that the 20% regression in floating number formatting microbenchmarks is acceptable. The scenarios targeted by Mono are very unlikely to be dominated by floating point number formatting. It is not a regression with significant customer impact.

The generic code should be smaller. I would expect the change to be a small improvement for IL-only binary size, which is also an important metric for scenarios targeted by Mono.

As Tanner pointed out, the patterns used by the offending PR are used in many places throughout the libraries. If Mono is not able to handle them well, it is a much bigger problem than floating point number formatting microbenchmarks. We should be looking at each case individually, but I do not think it makes sense to try to preserve unnaturally written code to prevent regressions at all costs.

It would be great if we could take another look at what it would take to add even basic generic specialization support to Mono

It may help here and there, but I do not think it would fix the structural problem that a lot of new code is leveraging advanced RyuJIT optimizations. I believe that we will need to eventually figure out how to leverage RyuJIT here. It is not feasible to replay the RyuJIT investments in Mono codegens.

@huoyaoyuan
Member

huoyaoyuan commented Jul 19, 2024

As Tanner pointed out, the patterns used by the offending PR are used in many places throughout the libraries. If Mono is not able to handle them well, it is a much bigger problem than floating point number formatting microbenchmarks.

Another case is the auxiliary IUtfChar type used for UTF-8/UTF-16 unification. It's used much more intensively. Microbenchmarks show that CoreCLR appears to handle IBinaryInteger well. If Mono can handle it equally well, switching to the public type will help types outside CoreLib, namely BigInteger.

That is probably a separate problem with the generic specialization pattern, though.

@tannergooding
Member

The generic code should be smaller. I would expect that the change is a small improvement for IL-only binary size that is also important metric for scenarios targeted by Mono.

I believe the issue (and would appreciate confirmation from @fanyang-mono or @matouskozak) is that this is hitting the general USG (Universal Shared Generics) paths for Mono and so it actually results in a non-trivial increase to size (see also #104952).

Due to the lack of specialization, the USG paths for value types on Mono also end up being slower and lead to these types of perf slowdowns. To my understanding, MonoAOT also doesn't utilize any logic similar to NativeAOT's, and so it is unable to discern when a given generic will only ever be encountered with a finite set of types. This causes it to generate code to handle any value type rather than the finite set that will actually be encountered in typical scenarios.

The belief is then that having a way to propagate this information to Mono, and having Mono take advantage of it, would go a long way towards mitigating these types of regressions.
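
As a rough illustration of why specialization matters, consider the common pattern below. The names are hypothetical and this is only a sketch of the pattern, not code from the libraries. When a compiler emits a separate body per value-type T, each `typeof(T)` comparison is a compile-time constant and the method can fold down to a single `return`; a shared, non-specialized body has to evaluate the comparisons at run time.

```csharp
using System;

public static class SpecializationSketch
{
    // Number of explicitly stored mantissa bits for a few IEEE 754 binary formats.
    public static int MantissaBits<T>() where T : struct
    {
        // With a body specialized for T, exactly one of these branches survives.
        if (typeof(T) == typeof(float)) return 23;
        if (typeof(T) == typeof(double)) return 52;
        if (typeof(T) == typeof(Half)) return 10;
        throw new NotSupportedException(typeof(T).Name);
    }
}
```

An attribute that lists the finite set of T, as suggested above, would in principle let an AOT compiler emit only those specialized bodies.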

I believe that we will need to eventually figure out how to leverage RyuJIT here. It is not feasible to replay the RyuJIT investments in Mono codegens.

👍, it is definitely a significant amount of work to mirror optimizations between all the possible backends (RyuJIT, MonoJIT, MonoInterpreter, MonoLLVM, etc). The wins when we have mirrored key optimizations have typically been positive when done, however, and I'd imagine we want to continue targeting key scenarios for the foreseeable future (at least until we are at a point where an alternative is viable), so I'm mostly just trying to call out the places that I believe would have the biggest impact based on my understanding of the problem space.

@jkotas
Member

jkotas commented Jul 19, 2024

I believe the issue (and would appreciate confirmation from @fanyang-mono or @matouskozak) is that this is hitting the general USG (Universal Shared Generics) paths for Mono

Yes, it would be good to confirm the root cause.

The USG fallbacks typically introduce much worse regressions than 20%. Given that the regression is relatively small, my assumption was that it is caused by inliner and optimizer limitations.

@jeffhandley jeffhandley added this to the 9.0.0 milestone Jul 20, 2024
@jeffhandley jeffhandley added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Jul 20, 2024
@fanyang-mono
Member

I will need to analyze the generated code closely to confirm that. Will report back later.

@fanyang-mono
Member

fanyang-mono commented Aug 9, 2024

I conducted my investigation based on the microbenchmark System.Tests.Perf_Single.ToStringWithFormat(value: 12345, format: "R"). It regressed 26% after @huoyaoyuan's formatting PR.
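
For readers who want a quick local approximation of that call, a minimal stand-alone sketch follows. It is not the benchmark from dotnet/performance; the class name, the `Measure` helper, and the use of `Stopwatch` with the invariant culture are assumptions made for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

public static class ToStringWithFormatSketch
{
    // Times repeated calls to float.ToString(format) for a single value.
    public static TimeSpan Measure(float value, string format, int iterations)
    {
        // Warm up once so one-time initialization is not part of the timed loop.
        string last = value.ToString(format, CultureInfo.InvariantCulture);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            last = value.ToString(format, CultureInfo.InvariantCulture);
        }
        sw.Stop();

        GC.KeepAlive(last); // keep the result observable
        return sw.Elapsed;
    }
}
```

A call such as `ToStringWithFormatSketch.Measure(12345f, "R", 1_000_000)` exercises the same formatting path, though absolute numbers from a loop like this are not comparable to the reported benchmark results.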

Most of the regression is caused by the JIT/run time of the newly introduced methods highlighted on the right-hand side of the following screenshot.
[Image: Screenshot 2024-08-09 at 3 24 17 PM]

Those methods are all generic methods which are not AOT'ed. Mono doesn't have a tiered JIT. I am not sure what kind of optimizations we could add to the Mono JIT to make these extra method calls go away. @lambdageek Feel free to comment if you have more thoughts on this.

@tannergooding
Member

Some of the other functions in there are also fairly hot.

Cases like new Vector<byte>, SpanHelpers.Fill, Unsafe.BitCast, etc. Many of them are intrinsics in Mono, so it's interesting that they are being left in as calls, right?

@matouskozak matouskozak changed the title [Perf] Linux/x64: 44 Regressions on 6/3/2024 6:35:27 PM [mono][Perf] Perf_Single and Perf_Double Regressions on 6/3/2024 6:35:27 PM Aug 15, 2024
@matouskozak matouskozak changed the title [mono][Perf] Perf_Single and Perf_Double Regressions on 6/3/2024 6:35:27 PM [mono][Perf] MonoAOT Perf_Single and Perf_Double Regressions on 6/3/2024 6:35:27 PM Aug 16, 2024
@jeffhandley jeffhandley added tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark labels Sep 17, 2024