Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize BigInteger.ToString for large decimal string #112178

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

kzrnm
Copy link
Contributor

@kzrnm kzrnm commented Feb 5, 2025

This PR is a counterpart to #55121. divide-and-conquer algorithm

Number.FormatBigInteger() can run in $D(n)log(N)$ time using the Divide and Conquer algorithm, where $D(n)$ represents the computational complexity of BigInteger division.

The computational complexity of division have been improved by #96895.


Benchmark

Code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using System.Numerics;

[DisassemblyDiagnoser]
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByMethod)]
public class ToStringTest
{
    [Params(100, 1000, 10000, 100000)]
    public int N;

    BigInteger b;

    [GlobalSetup]
    public void Setup()
    {
        b = BigInteger.Parse(new string('9', N));
    }

    [Benchmark] public string DecimalString() => b.ToString();
}

BenchmarkDotNet v0.13.12, Windows 11 (10.0.26100.3037)
13th Gen Intel Core i5-13500, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.100-alpha.1.25077.2
  [Host]   : .NET 10.0.0 (10.0.25.7313), X64 RyuJIT AVX2
  ShortRun : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX2

Job=ShortRun  IterationCount=3  LaunchCount=1  
WarmupCount=3  

Method Toolchain N Mean Error StdDev Ratio RatioSD Code Size Gen0 Gen1 Gen2 Allocated Alloc Ratio
DecimalString \main\corerun.exe 100 162.6 ns 72.78 ns 3.99 ns 1.00 0.00 3,976 B 0.0176 - - 224 B 1.00
DecimalString \pr\corerun.exe 100 162.2 ns 80.65 ns 4.42 ns 1.00 0.05 3,836 B 0.0176 - - 224 B 1.00
DecimalString \main\corerun.exe 1000 10,087.1 ns 1,262.45 ns 69.20 ns 1.00 0.00 3,801 B 0.1526 - - 2024 B 1.00
DecimalString \pr\corerun.exe 1000 6,181.8 ns 877.68 ns 48.11 ns 0.61 0.01 3,757 B 0.1602 - - 2024 B 1.00
DecimalString \main\corerun.exe 10000 1,042,496.9 ns 207,485.10 ns 11,372.96 ns 1.00 0.00 3,801 B - - - 20026 B 1.00
DecimalString \pr\corerun.exe 10000 336,682.2 ns 132,151.44 ns 7,243.67 ns 0.32 0.01 3,758 B 1.4648 - - 20025 B 1.00
DecimalString \main\corerun.exe 100000 100,113,761.1 ns 18,888,052.58 ns 1,035,317.90 ns 1.00 0.00 3,800 B - - - 200155 B 1.00
DecimalString \pr\corerun.exe 100000 11,684,286.5 ns 508,446.11 ns 27,869.65 ns 0.12 0.00 3,757 B 46.8750 46.8750 46.8750 200088 B 1.00

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Feb 5, 2025
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.


int cuDst = 0;
// The Ratio is calculated as: log_{10^9}(2^32)
const double digitRatio = 1.0703288734719332;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it worth to do a FP calculation, instead of conservatively allocate for upper limit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's not much difference, but the required storage size will be smaller. For example, when the bit length is 58 or 59, stackalloc is used.

_bit.Length old new
57 64 62
58 65 63
59 66 64

This change also aims to achieve mathematical correctness in the implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either stack allocation or array poll will round the requested size up to whole number. Such difference should be really negligible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When coming with FP arithmetic, it's also important to make sure rounding doesn't give you smaller integer result for corner cases. That's why I'm suggesting against FP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rounded up the value of digitRatio from $1.0703288734719332$, which is the return value of Math.Log(1L<<32, 1e9), to $1.070328873472$. Since double has higher precision than int, unintended rounding down has become theoretically impossible.

@kzrnm kzrnm force-pushed the BigIntegerStringDec branch from 2abfe84 to f825641 Compare February 12, 2025 19:13
@danmoseley danmoseley requested a review from Copilot February 14, 2025 03:49

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (3)

src/libraries/System.Runtime.Numerics/src/System/Number.BigInteger.cs:797

  • [nitpick] The variable name 'digitRatio' is ambiguous. It should be renamed to something more descriptive like 'logBase2ToBase1E9Ratio'.
const double digitRatio = 1.070328873472;

src/libraries/System.Runtime.Numerics/src/System/Number.BigInteger.cs:933

  • [nitpick] The method name 'BigIntegerToBase1E9' could be more descriptive. Consider renaming it to 'ConvertBigIntegerToBase1E9'.
private static void BigIntegerToBase1E9(ReadOnlySpan<uint> bits, Span<uint> base1E9Buffer, out int leadingWritten)

src/libraries/System.Runtime.Numerics/tests/BigInteger/BigIntegerToStringTests.cs:13

  • The variable 'result' is initialized to 'null' but is not explicitly set in the 'catch' block, which could lead to a 'NullReferenceException'. Ensure 'result' is properly handled in the 'catch' block.
string result = null;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Numerics community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants