Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate writing large string values in 4 KB chunks within Utf8JsonWriter instead all at once #29293

Closed
ahsonkhan opened this issue Apr 17, 2019 · 4 comments
Labels
area-System.Text.Json backlog-cleanup-candidate An inactive issue that has been marked for automated closure. enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Milestone

Comments

@ahsonkhan
Copy link
Member

Regardless of the mode (IBufferWriter or stream), consider writing JSON string values (whether UTF-16 or UTF-8) in chunks of fixed size (for example 4_096). Currently, the writer pessimistically asks for as much space as could be required in the worst case, and tries to write the whole string in one shot. It would reduce allocation/GC pressure if we wrote in smaller chunks.

For the stream mode, we should also consider putting a size cap ~ 64k to avoid the LOH.

Note: This only applies to JSON string values. Property names and other tokens (like numbers, literals, datetime/etc.) all tend to generally be small enough to not benefit from this optimization.

For example, something like the following, where we enter the "slow/loop" path for strings larger than 4K:

private void WriteStringMinimized(ReadOnlySpan<char> escapedValue)
{
    Debug.Assert(escapedValue.Length < (int.MaxValue / JsonConstants.MaxExpansionFactorWhileTranscoding) - 3);

    if (escapedValue.Length > 4_000)
    {
        WriteStringMinimizedSlow(escapedValue);
        return;
    }

    // Otherwise, fall down the normal/fast code path
}

private void WriteStringMinimizedSlow(ReadOnlySpan<char> escapedValue)
{
    Debug.Assert(escapedValue.Length > 4_000);
    Debug.Assert(escapedValue.Length < (int.MaxValue / JsonConstants.MaxExpansionFactorWhileTranscoding) - 3);

    if (_memory.Length - BytesPending < escapedValue.Length)
    {
        Grow(2);
    }

    Span<byte> output = _memory.Span;

    if (_currentDepth < 0)
    {
        output[BytesPending++] = JsonConstants.ListSeparator;
    }
    output[BytesPending++] = JsonConstants.Quote;

    TranscodeAndWriteLoop(escapedValue, output.Slice(BytesPending));
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void TranscodeAndWriteLoop(ReadOnlySpan<char> escapedPropertyName, Span<byte> output)
{
    ReadOnlySpan<byte> byteSpan = MemoryMarshal.AsBytes(escapedPropertyName);
    int partialConsumed = 0;
    while (true)
    {
        OperationStatus status = JsonWriterHelper.ToUtf8(byteSpan.Slice(partialConsumed), output, out int consumed, out int written);
        BytesPending += written;
        if (status == OperationStatus.Done)
        {
            break;
        }
        partialConsumed += consumed;
        Span<byte> temp = GrowAndReturn();
        output = temp;
    }

    if (output.Length <= BytesPending)
    {
        Span<byte> temp = GrowAndReturn();
        output = temp;
    }

    output[BytesPending++] = JsonConstants.Quote;
}

private Span<byte> GrowAndReturn()
{
    FlushHelper();

    if (_stream != null)
    {
        FlushHelperStream();
        _memory = _output.GetMemory(DefaultGrowthSize);

        Debug.Assert(_memory.Length >= DefaultGrowthSize);
    }
    else
    {
        _memory = _output.GetMemory(DefaultGrowthSize);

        if (_memory.Length < DefaultGrowthSize)
        {
            ThrowHelper.ThrowArgumentException_NeedLargerSpan();
        }
    }

    BytesCommitted += BytesPending;
    BytesPending = 0;

    return _memory.Span;
}

See related PR: dotnet/corefx#36961

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the Future milestone Feb 1, 2020
@marcusturewicz
Copy link
Contributor

marcusturewicz commented Apr 11, 2020

I've had a crack at this with the following benchmarks:

using System;
using System.Buffers;
using System.IO;
using System.Text;
using System.Text.Json2;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;

namespace Utf8JsonWriter4KChunking
{
    [MemoryDiagnoser]
    public class Benchmarks
    {
        private char[] _string;

        [Params(100, 1_000, 2_000, 4_000, 8_000, 16_000)]
        public int N;

        [GlobalSetup]
        public void Setup()
        {
            _string = GetString(N);
        }

        public static char[] GetString(int length)
        {
            var sb = new StringBuilder(length);
            for (int i = 0; i < length; i++)
            {
                sb.Append("A");
            }

            return sb.ToString().ToCharArray();
        }        

        [Benchmark]
        public void WriteStringMinimized()
        {
            using (var writer = new Utf8JsonWriter(new ArrayBufferWriter<byte>()))
            {
                writer.WriteStringMinimized(_string.AsSpan());
            }
        }  

        [Benchmark]
        public void WriteStringMinimizedChunked()
        {
            using (var writer = new Utf8JsonWriter(new ArrayBufferWriter<byte>()))
            {
                writer.WriteStringMinimizedChunked(_string.AsSpan());
            }
        }                   
    }
}

which produces:

Method N Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
WriteStringMinimized 100 159.3 ns 0.42 ns 0.37 ns 0.2294 - - 480 B
WriteStringMinimizedChunked 100 162.6 ns 1.67 ns 1.48 ns 0.2294 - - 480 B
WriteStringMinimized 1000 643.8 ns 3.03 ns 2.36 ns 1.5221 - - 3184 B
WriteStringMinimizedChunked 1000 808.4 ns 7.13 ns 6.67 ns 1.5221 - - 3184 B
WriteStringMinimized 2000 1,500.3 ns 11.13 ns 9.86 ns 2.9488 - - 6184 B
WriteStringMinimizedChunked 2000 1,218.7 ns 23.97 ns 21.25 ns 2.9488 - - 6184 B
WriteStringMinimized 4000 2,285.8 ns 25.06 ns 23.44 ns 5.8136 - - 12184 B
WriteStringMinimizedChunked 4000 2,276.7 ns 25.42 ns 22.53 ns 5.8136 - - 12184 B
WriteStringMinimized 8000 5,574.5 ns 58.86 ns 55.05 ns 11.4899 - - 24184 B
WriteStringMinimizedChunked 8000 4,614.0 ns 20.01 ns 17.74 ns 2.2964 - - 4808 B
WriteStringMinimized 16000 10,803.2 ns 149.54 ns 124.87 ns 22.7203 - - 48184 B
WriteStringMinimizedChunked 16000 6,365.8 ns 39.64 ns 33.10 ns 2.2964 - - 4808 B

@marcusturewicz
Copy link
Contributor

I realised there was already a benchmark for this in dotnet/performance, so I have gone ahead and run the changes against that:

BenchmarkDotNet=v0.12.1, OS=macOS Mojave 10.14.6 (18G2022) [Darwin 18.7.0]
Intel Core i5-6360U CPU 2.00GHz (Skylake), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=5.0.100-preview.4.20214.21
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.21403, CoreFX 5.0.20.21403), X64 RyuJIT
  Job-JUSUMR : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
  Job-FYQBCX : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1  
Method Job Toolchain Formatted SkipValidation Escaped Mean Error StdDev Median Min Max Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
WriteStringsUtf8 Job-JUSUMR master False False AllEscaped 82.595 ms 2.1276 ms 2.1849 ms 82.101 ms 79.880 ms 88.025 ms 0.98 0.06 - - - 1752 B
WriteStringsUtf8 Job-FYQBCX 4KChunking False False AllEscaped 83.927 ms 3.7998 ms 4.0657 ms 82.820 ms 79.832 ms 93.181 ms 1.00 0.00 - - - 1752 B
WriteStringsUtf16 Job-JUSUMR master False False AllEscaped 89.948 ms 1.9219 ms 2.0564 ms 89.485 ms 87.539 ms 94.044 ms 0.99 0.02 - - - 3240 B
WriteStringsUtf16 Job-FYQBCX 4KChunking False False AllEscaped 90.920 ms 1.5026 ms 1.4758 ms 90.413 ms 89.479 ms 95.283 ms 1.00 0.00 - - - 3384 B
WriteStringsUtf8 Job-JUSUMR master False False OneEscaped 12.722 ms 0.1957 ms 0.1528 ms 12.734 ms 12.493 ms 12.999 ms 0.87 0.07 - - - 250 B
WriteStringsUtf8 Job-FYQBCX 4KChunking False False OneEscaped 14.614 ms 1.1197 ms 1.2894 ms 14.208 ms 13.082 ms 17.119 ms 1.00 0.00 - - - 203 B
WriteStringsUtf16 Job-JUSUMR master False False OneEscaped 17.775 ms 0.6056 ms 0.6219 ms 17.889 ms 16.816 ms 18.941 ms 0.96 0.03 - - - 477 B
WriteStringsUtf16 Job-FYQBCX 4KChunking False False OneEscaped 18.510 ms 0.3514 ms 0.3608 ms 18.504 ms 17.969 ms 19.342 ms 1.00 0.00 - - - 367 B
WriteStringsUtf8 Job-JUSUMR master False False NoneEscaped 6.949 ms 0.4184 ms 0.4818 ms 6.751 ms 6.424 ms 7.998 ms 1.02 0.07 - - - 139 B
WriteStringsUtf8 Job-FYQBCX 4KChunking False False NoneEscaped 6.823 ms 0.1857 ms 0.2138 ms 6.924 ms 6.459 ms 7.138 ms 1.00 0.00 - - - 185 B
WriteStringsUtf16 Job-JUSUMR master False False NoneEscaped 11.645 ms 0.5281 ms 0.5187 ms 11.603 ms 11.060 ms 13.144 ms 1.05 0.05 - - - 193 B
WriteStringsUtf16 Job-FYQBCX 4KChunking False False NoneEscaped 11.104 ms 0.1740 ms 0.1628 ms 11.146 ms 10.842 ms 11.378 ms 1.00 0.00 - - - 238 B
WriteStringsUtf8 Job-JUSUMR master False True AllEscaped 80.760 ms 1.8759 ms 2.1602 ms 79.577 ms 78.584 ms 86.199 ms 1.01 0.04 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking False True AllEscaped 80.208 ms 1.5761 ms 1.6865 ms 80.012 ms 78.473 ms 84.093 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master False True AllEscaped 94.870 ms 2.4977 ms 2.7762 ms 94.309 ms 91.573 ms 101.268 ms 1.04 0.08 - - - 120 B
WriteStringsUtf16 Job-FYQBCX 4KChunking False True AllEscaped 91.491 ms 5.3954 ms 6.2133 ms 88.524 ms 86.252 ms 103.166 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master False True OneEscaped 12.981 ms 0.2388 ms 0.2233 ms 12.853 ms 12.733 ms 13.397 ms 1.01 0.04 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking False True OneEscaped 12.794 ms 0.3754 ms 0.4323 ms 12.656 ms 12.305 ms 13.803 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master False True OneEscaped 17.333 ms 0.2681 ms 0.2508 ms 17.328 ms 16.942 ms 17.783 ms 1.00 0.01 - - - 120 B
WriteStringsUtf16 Job-FYQBCX 4KChunking False True OneEscaped 17.343 ms 0.2365 ms 0.2212 ms 17.295 ms 16.985 ms 17.696 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master False True NoneEscaped 6.489 ms 0.1055 ms 0.0987 ms 6.447 ms 6.341 ms 6.633 ms 1.06 0.02 - - - 142 B
WriteStringsUtf8 Job-FYQBCX 4KChunking False True NoneEscaped 6.111 ms 0.1073 ms 0.0951 ms 6.074 ms 5.998 ms 6.315 ms 1.00 0.00 - - - 151 B
WriteStringsUtf16 Job-JUSUMR master False True NoneEscaped 11.366 ms 0.2191 ms 0.2344 ms 11.272 ms 11.095 ms 11.731 ms 0.99 0.04 - - - 120 B
WriteStringsUtf16 Job-FYQBCX 4KChunking False True NoneEscaped 11.460 ms 0.2372 ms 0.2732 ms 11.302 ms 11.173 ms 12.001 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master True False AllEscaped 81.640 ms 1.4894 ms 1.3932 ms 81.665 ms 79.870 ms 83.791 ms 1.03 0.02 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking True False AllEscaped 79.730 ms 1.5723 ms 1.3130 ms 78.756 ms 78.503 ms 82.065 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master True False AllEscaped 92.631 ms 1.8256 ms 2.0291 ms 91.998 ms 90.527 ms 96.764 ms 1.03 0.03 - - - 120 B
WriteStringsUtf16 Job-FYQBCX 4KChunking True False AllEscaped 90.088 ms 1.7202 ms 1.9119 ms 89.899 ms 87.870 ms 93.800 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master True False OneEscaped 14.378 ms 0.2350 ms 0.1962 ms 14.349 ms 14.111 ms 14.690 ms 1.05 0.04 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking True False OneEscaped 13.655 ms 0.3386 ms 0.3763 ms 13.408 ms 13.260 ms 14.526 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master True False OneEscaped 19.104 ms 0.1891 ms 0.1676 ms 19.108 ms 18.801 ms 19.414 ms 1.02 0.03 - - - 120 B
WriteStringsUtf16 Job-FYQBCX 4KChunking True False OneEscaped 18.686 ms 0.3624 ms 0.4173 ms 18.649 ms 18.173 ms 19.572 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master True False NoneEscaped 7.410 ms 0.1150 ms 0.1075 ms 7.403 ms 7.243 ms 7.572 ms 1.02 0.01 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking True False NoneEscaped 7.240 ms 0.1040 ms 0.0972 ms 7.250 ms 7.097 ms 7.425 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master True False NoneEscaped 12.553 ms 0.1904 ms 0.1688 ms 12.550 ms 12.356 ms 12.859 ms 1.01 0.03 - - - 197 B
WriteStringsUtf16 Job-FYQBCX 4KChunking True False NoneEscaped 12.370 ms 0.2344 ms 0.2192 ms 12.381 ms 12.035 ms 12.783 ms 1.00 0.00 - - - 183 B
WriteStringsUtf8 Job-JUSUMR master True True AllEscaped 80.800 ms 1.4158 ms 1.2551 ms 80.850 ms 79.130 ms 83.471 ms 0.98 0.02 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking True True AllEscaped 82.739 ms 1.4673 ms 1.3725 ms 82.266 ms 81.372 ms 86.250 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master True True AllEscaped 94.086 ms 1.3232 ms 1.1730 ms 94.070 ms 92.381 ms 95.909 ms 1.02 0.03 - - - 120 B
WriteStringsUtf16 Job-FYQBCX 4KChunking True True AllEscaped 91.992 ms 1.8990 ms 2.1107 ms 91.543 ms 89.228 ms 96.500 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master True True OneEscaped 13.610 ms 0.2605 ms 0.2676 ms 13.529 ms 13.258 ms 14.101 ms 0.99 0.02 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking True True OneEscaped 13.715 ms 0.1953 ms 0.1827 ms 13.759 ms 13.447 ms 14.115 ms 1.00 0.00 - - - 174 B
WriteStringsUtf16 Job-JUSUMR master True True OneEscaped 19.690 ms 0.3017 ms 0.2822 ms 19.637 ms 19.322 ms 20.237 ms 1.07 0.03 - - - 194 B
WriteStringsUtf16 Job-FYQBCX 4KChunking True True OneEscaped 18.354 ms 0.3067 ms 0.2869 ms 18.353 ms 17.973 ms 18.995 ms 1.00 0.00 - - - 120 B
WriteStringsUtf8 Job-JUSUMR master True True NoneEscaped 7.226 ms 0.1334 ms 0.1248 ms 7.256 ms 7.011 ms 7.402 ms 1.01 0.02 - - - 120 B
WriteStringsUtf8 Job-FYQBCX 4KChunking True True NoneEscaped 7.179 ms 0.0884 ms 0.0827 ms 7.191 ms 7.053 ms 7.319 ms 1.00 0.00 - - - 120 B
WriteStringsUtf16 Job-JUSUMR master True True NoneEscaped 12.351 ms 0.3524 ms 0.3619 ms 12.296 ms 11.929 ms 13.325 ms 1.00 0.03 - - - 252 B
WriteStringsUtf16 Job-FYQBCX 4KChunking True True NoneEscaped 12.333 ms 0.1818 ms 0.1701 ms 12.382 ms 12.103 ms 12.594 ms 1.00 0.00 - - - 193 B

@ghost
Copy link

ghost commented Oct 15, 2021

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of the experimental issue cleanup initiative we are currently trialing in a limited number of areas. Please share any feedback you might have in the linked issue.

@ghost
Copy link

ghost commented Nov 5, 2021

This issue will now be closed since it had been marked no recent activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.

@ghost ghost closed this as completed Nov 5, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 5, 2021
@eiriktsarpalis eiriktsarpalis added the backlog-cleanup-candidate An inactive issue that has been marked for automated closure. label Feb 18, 2022
@ghost ghost removed the no-recent-activity label Feb 18, 2022
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Text.Json backlog-cleanup-candidate An inactive issue that has been marked for automated closure. enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

4 participants