-
Notifications
You must be signed in to change notification settings - Fork 261
Closed
Description
The method Task<ReadOnlyMemory<byte>> ReadFullAsync(Stream data, int currentPartSize) has a number of performance issues.
- Allocates too much memory: with large files, the part size will be 16MB. Depending on the Stream's implementation and data availability, this method can allocate a lot of memory due to creating buffers in a loop.
- Makes too many copies of the read data
- Copies read data byte-by-byte.
minio-dotnet/Minio/ApiEndpoints/ObjectOperations.cs
Lines 1127 to 1149 in 8daf3d8
| internal async Task<ReadOnlyMemory<byte>> ReadFullAsync(Stream data, int currentPartSize) | |
| { | |
| Memory<byte> result = new byte[currentPartSize]; | |
| var totalRead = 0; | |
| while (totalRead < currentPartSize) | |
| { | |
| Memory<byte> curData = new byte[currentPartSize - totalRead]; | |
| var curRead = await data.ReadAsync(curData[..(currentPartSize - totalRead)]).ConfigureAwait(false); | |
| if (curRead == 0) break; | |
| for (var i = 0; i < curRead; i++) | |
| curData.Slice(i, 1).CopyTo(result[(totalRead + i)..]); | |
| totalRead += curRead; | |
| } | |
| if (totalRead == 0) return null; | |
| if (totalRead == currentPartSize) return result; | |
| Memory<byte> truncatedResult = new byte[totalRead]; | |
| for (var i = 0; i < totalRead; i++) | |
| result.Slice(i, 1).CopyTo(truncatedResult[i..]); | |
| return truncatedResult; | |
| } |
I have will be submitting a PR in the coming days. I created a benchmark comparing my version with the current one. I will provide the source to this benchmark as well. Here is the updated version I am using in my benchmark,
private async Task<ReadOnlyMemory<byte>> ReadFullAsync2(Stream data, int currentPartSize)
{
Memory<byte> result = new byte[currentPartSize];
var totalRead = 0;
while (totalRead < currentPartSize)
{
var curData = result[totalRead..currentPartSize];
var curRead = await data.ReadAsync(curData).ConfigureAwait(false);
if (curRead == 0) break;
totalRead += curRead;
}
if (totalRead == 0) return null;
// Return only the valid portion without allocating a new buffer
return result[..totalRead];
}In my benchmark, I am using a virtual random data stream
internal class VirtualStream : Stream
{
private long position;
private readonly long length;
public VirtualStream(long length)
{
position = 0;
this.length = length;
}
public override bool CanRead => true;
public override bool CanSeek => false;
public override bool CanWrite => false;
public override long Length => length;
public override long Position
{
get => position;
set => position = (int)value;
}
public override void Flush() { }
public override int Read(byte[] buffer, int offset, int count)
{
// Calculate how many bytes are left to read
var remaining = length - position;
if (remaining <= 0)
return 0; // End of stream
// Limit the read to a maximum of 4096 bytes per call
var maxRead = Math.Min(4096, count);
var toRead = (int)Math.Min(maxRead, remaining);
Random.Shared.NextBytes(buffer.AsSpan(offset, toRead));
position += toRead;
return toRead;
}
public override void SetLength(long value)
{
#pragma warning disable MA0025
throw new NotSupportedException();
#pragma warning restore MA0025
}
public override void Write(byte[] buffer, int offset, int count)
{
#pragma warning disable MA0025
throw new NotSupportedException();
#pragma warning restore MA0025
}
public override long Seek(long offset, SeekOrigin origin)
{
#pragma warning disable MA0025
throw new NotImplementedException();
#pragma warning restore MA0025
}
}When running the benchmarks for varying part sizes, you can see in the worse case senario with a part size of 16MB, 333x times longer, allocates 1000x more data and GC's are through the roof.
| Method | PartSize | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadFullAsync_Benchmark | 16777216 | 3,711,505.3 us | 60,184.75 us | 50,256.99 us | 3,705,554.8 us | 1.000 | 0.02 | 457000.0000 | 457000.0000 | 457000.0000 | 33581040.77 KB | 1.000 |
| ReadFullAsync2_Benchmark | 16777216 | 9,830.6 us | 188.57 us | 185.20 us | 9,776.0 us | 0.003 | 0.00 | - | - | - | 16833.01 KB | 0.001 |
| ReadFullAsync_Benchmark | 65536 | 700.4 us | 13.95 us | 37.71 us | 685.6 us | 1.00 | 0.07 | - | - | - | 611.15 KB | 1.00 |
| ReadFullAsync2_Benchmark | 65536 | 103.2 us | 6.56 us | 19.13 us | 106.0 us | 0.15 | 0.03 | - | - | - | 66.76 KB | 0.11 |
| ReadFullAsync_Benchmark | 1048576 | 19,956.5 us | 615.71 us | 1,776.47 us | 19,487.5 us | 1.01 | 0.12 | 29000.0000 | 29000.0000 | 29000.0000 | 132834.36 KB | 1.000 |
| ReadFullAsync2_Benchmark | 1048576 | 847.5 us | 29.89 us | 88.13 us | 839.3 us | 0.04 | 0.01 | - | - | - | 1053.01 KB | 0.008 |
Metadata
Metadata
Assignees
Labels
No labels