Optimize `BigInteger.ToString` for large decimal string #112178

kzrnm · 2025-02-05T06:31:01Z

This PR is a counterpart to #55121, applying a divide-and-conquer algorithm to `BigInteger.ToString`.

`Number.FormatBigInteger()` can run in $O(D(n) \log n)$ time using a divide-and-conquer algorithm, where $n$ is the size of the input and $D(n)$ is the computational complexity of BigInteger division.
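As a rough sketch of where that bound comes from (my own reasoning, not taken from the PR): assume the value is split once by a power of ten with about half as many digits, that this split costs $D(n)$, and that division is at least linear in the sense $2D(n/2) \le D(n)$. Writing $T(n)$ for the cost of converting an $n$-digit value, a symbol introduced here for illustration:

$$
\begin{aligned}
T(n) &= 2\,T(n/2) + D(n) \\
     &= \sum_{k=0}^{\log_2 n - 1} 2^k\, D\!\left(\frac{n}{2^k}\right) + n\,T(1) \\
     &\le D(n)\,\log_2 n + O(n).
\end{aligned}
$$

Each of the $\log_2 n$ levels costs at most $D(n)$ in total, which gives the $O(D(n) \log n)$ estimate. The cost of precomputing the powers of ten is ignored here on the assumption that it is dominated by the divisions.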

The computational complexity of division was improved by #96895.
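To make the divide-and-conquer idea concrete, here is a minimal sketch using only the public `BigInteger` API. It is not the code in `Number.BigInteger.cs`: the class name `DivideAndConquerFormatter`, the `ThresholdBits` cutoff, and the fallback to the built-in `ToString` for small pieces are all assumptions made for illustration. The real implementation appears to work on base-$10^9$ digits rather than on strings.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;
using System.Text;

// Hypothetical helper for illustration only; not the PR's implementation.
public static class DivideAndConquerFormatter
{
    // Pieces at or below this bit length use the built-in conversion (arbitrary cutoff).
    private const int ThresholdBits = 256;

    public static string ToDecimalString(BigInteger value)
    {
        if (value.Sign < 0)
            return "-" + ToDecimalString(BigInteger.Negate(value));

        // powers[k] = 10^(2^k), built by repeated squaring until powers[top]^2 > value.
        var powers = new List<BigInteger> { 10 };
        while (powers[^1] * powers[^1] <= value)
            powers.Add(powers[^1] * powers[^1]);

        var sb = new StringBuilder();
        Append(value, powers.Count - 1, powers, sb, pad: false);
        return sb.ToString();
    }

    // Precondition: 0 <= value < 10^(2^(level+1)).
    // When pad is true, the output is zero-padded to exactly 2^(level+1) digits.
    private static void Append(BigInteger value, int level, List<BigInteger> powers,
                               StringBuilder sb, bool pad)
    {
        int width = 1 << (level + 1);
        if (level < 0 || value.GetBitLength() <= ThresholdBits)
        {
            string s = value.ToString();
            if (pad) sb.Append('0', width - s.Length);
            sb.Append(s);
            return;
        }

        // One large division splits the value into two halves of 2^level digits each.
        BigInteger high = BigInteger.DivRem(value, powers[level], out BigInteger low);
        if (!pad && high.IsZero)
        {
            // No significant digits in the upper half: skip it to avoid leading zeros.
            Append(low, level - 1, powers, sb, pad: false);
            return;
        }
        Append(high, level - 1, powers, sb, pad);
        Append(low, level - 1, powers, sb, pad: true);
    }
}
```

Calling `DivideAndConquerFormatter.ToDecimalString(x)` should return the same text as `x.ToString()`. The point of the sketch is the shape of the recursion: one division by $10^{2^k}$ per node, with the lower half always zero-padded so that interior zeros are preserved.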

Benchmark

Code

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using System.Numerics;

[DisassemblyDiagnoser]
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByMethod)]
public class ToStringTest
{
    [Params(100, 1000, 10000, 100000)]
    public int N;

    BigInteger b;

    [GlobalSetup]
    public void Setup()
    {
        b = BigInteger.Parse(new string('9', N));
    }

    [Benchmark] public string DecimalString() => b.ToString();
}


BenchmarkDotNet v0.13.12, Windows 11 (10.0.26100.3037)
13th Gen Intel Core i5-13500, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.100-alpha.1.25077.2
  [Host]   : .NET 10.0.0 (10.0.25.7313), X64 RyuJIT AVX2
  ShortRun : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX2

Job=ShortRun  IterationCount=3  LaunchCount=1  
WarmupCount=3

| Method | Toolchain | N | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DecimalString | \main\corerun.exe | 100 | 162.6 ns | 72.78 ns | 3.99 ns | 1.00 | 0.00 | 3,976 B | 0.0176 | - | - | 224 B | 1.00 |
| DecimalString | \pr\corerun.exe | 100 | 162.2 ns | 80.65 ns | 4.42 ns | 1.00 | 0.05 | 3,836 B | 0.0176 | - | - | 224 B | 1.00 |
| DecimalString | \main\corerun.exe | 1000 | 10,087.1 ns | 1,262.45 ns | 69.20 ns | 1.00 | 0.00 | 3,801 B | 0.1526 | - | - | 2024 B | 1.00 |
| DecimalString | \pr\corerun.exe | 1000 | 6,181.8 ns | 877.68 ns | 48.11 ns | 0.61 | 0.01 | 3,757 B | 0.1602 | - | - | 2024 B | 1.00 |
| DecimalString | \main\corerun.exe | 10000 | 1,042,496.9 ns | 207,485.10 ns | 11,372.96 ns | 1.00 | 0.00 | 3,801 B | - | - | - | 20026 B | 1.00 |
| DecimalString | \pr\corerun.exe | 10000 | 336,682.2 ns | 132,151.44 ns | 7,243.67 ns | 0.32 | 0.01 | 3,758 B | 1.4648 | - | - | 20025 B | 1.00 |
| DecimalString | \main\corerun.exe | 100000 | 100,113,761.1 ns | 18,888,052.58 ns | 1,035,317.90 ns | 1.00 | 0.00 | 3,800 B | - | - | - | 200155 B | 1.00 |
| DecimalString | \pr\corerun.exe | 100000 | 11,684,286.5 ns | 508,446.11 ns | 27,869.65 ns | 0.12 | 0.00 | 3,757 B | 46.8750 | 46.8750 | 46.8750 | 200088 B | 1.00 |

dotnet-policy-service · 2025-02-05T06:31:33Z

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

huoyaoyuan · 2025-02-05T06:39:51Z

src/libraries/System.Runtime.Numerics/src/System/Number.BigInteger.cs

-
-            int cuDst = 0;
+            // The Ratio is calculated as: log_{10^9}(2^32)
+            const double digitRatio = 1.0703288734719332;


Is it worth doing an FP calculation instead of conservatively allocating for the upper limit?

There's not much difference, but the required storage size will be smaller. For example, when the bit length is 58 or 59, the smaller size means stackalloc is used.

| _bit.Length | old | new |
|---|---|---|
| 57 | 64 | 62 |
| 58 | 65 | 63 |
| 59 | 66 | 64 |

This change also aims to achieve mathematical correctness in the implementation.

Both stack allocation and the array pool round the requested size up to a whole number anyway. Such a difference should be negligible.

With FP arithmetic, it's also important to make sure that rounding doesn't give you a smaller integer result in corner cases. That's why I'm suggesting against FP.

I rounded `digitRatio` up from $1.0703288734719332$, the value returned by `Math.Log(1L<<32, 1e9)`, to $1.070328873472$. Since `double` has higher precision than `int`, an unintended round-down is now theoretically impossible.
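The sizes discussed in this thread can be reproduced with a small sketch. The formulas below are reconstructed from the old/new table and the `digitRatio` constant, so treat them as assumptions rather than the PR's actual code. The class name `Base1E9BufferSize` and the integer-only variant `IntegerOnly` are hypothetical and added only to illustrate the trade-off the reviewer raised.

```csharp
using System;

// Hypothetical helper; formulas are inferred from the review thread, not copied from the PR.
public static class Base1E9BufferSize
{
    // log_{10^9}(2^32) = 32 * ln(2) / (9 * ln(10)) ≈ 1.0703288734719332, rounded up.
    private const double DigitRatio = 1.070328873472;

    // Conservative integer estimate that matches the "old" column: 10/9 > DigitRatio.
    public static int Old(int bitsLength) => bitsLength * 10 / 9 + 1;

    // Floating-point estimate that matches the "new" column.
    public static int New(int bitsLength) => (int)(bitsLength * DigitRatio) + 1;

    // Integer-only alternative (my assumption): 1071/1000 is a rational upper bound on the ratio.
    public static int IntegerOnly(int bitsLength) => (int)(bitsLength * 1071L / 1000) + 1;
}
```

For a length of 57 this gives `Old(57) == 64` and `New(57) == 62`, in line with the table. The integer-only variant yields the same values for 57 through 59 while avoiding floating-point rounding entirely, at the cost of a slightly looser bound for very large inputs.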

src/libraries/System.Runtime.Numerics/src/System/Number.BigInteger.cs

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (3)

src/libraries/System.Runtime.Numerics/src/System/Number.BigInteger.cs:797

[nitpick] The variable name 'digitRatio' is ambiguous. It should be renamed to something more descriptive like 'logBase2ToBase1E9Ratio'.

const double digitRatio = 1.070328873472;

src/libraries/System.Runtime.Numerics/src/System/Number.BigInteger.cs:933

[nitpick] The method name 'BigIntegerToBase1E9' could be more descriptive. Consider renaming it to 'ConvertBigIntegerToBase1E9'.

private static void BigIntegerToBase1E9(ReadOnlySpan<uint> bits, Span<uint> base1E9Buffer, out int leadingWritten)

src/libraries/System.Runtime.Numerics/tests/BigInteger/BigIntegerToStringTests.cs:13

The variable 'result' is initialized to 'null' but is not explicitly set in the 'catch' block, which could lead to a 'NullReferenceException'. Ensure 'result' is properly handled in the 'catch' block.

string result = null;

dotnet-issue-labeler bot added the area-System.Numerics label Feb 5, 2025

dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Feb 5, 2025

huoyaoyuan reviewed Feb 5, 2025

EgorBo reviewed Feb 5, 2025

EgorBo mentioned this pull request Feb 5, 2025

Expose a malloca API that either stackallocs or creates an array. #52065

Open

kzrnm force-pushed the BigIntegerStringDec branch from efcf4e1 to bf75784 Compare February 5, 2025 11:12

kzrnm mentioned this pull request Feb 11, 2025

Split FormatBigInteger into smaller parts. #112413

Open

kzrnm added 4 commits February 12, 2025 01:10

Add test

d91d02e

ToString by DivideAndConquer

fe45da6

ToStringTestThreshold

791b6ab

Fix base1E9Buffer

86f4a53

kzrnm force-pushed the BigIntegerStringDec branch from bf75784 to 86f4a53 Compare February 11, 2025 16:10

Rount up digitRatio

f825641

kzrnm force-pushed the BigIntegerStringDec branch from 2abfe84 to f825641 Compare February 12, 2025 19:13

danmoseley requested a review from Copilot February 14, 2025 03:49

Copilot AI reviewed Feb 14, 2025
