
Conversation

@antonfirsov
Member

@antonfirsov antonfirsov commented Aug 8, 2021

UPDATE 2: Ready for review!

  • Although later than it used to, 32-bit still fails with OutOfMemoryException. Consider retrying Marshal.AllocHGlobal on OutOfMemoryException after a short wait. DONE: we now block the thread on OOM and retry the allocation. With 20 threads, 32-bit is 2x slower than 64-bit, but no longer OOMs. The retries alone are not responsible for the 2x slowdown; the 32-bit runtime also seems to run about 1.5x slower with 10 threads, where there are no OOMs.
  • Implement PreferContiguousImageBuffers, remove MemoryAllocator.MinimumContiguousBlockSizeBytes.
  • Implement PixelAccessor<T> and other Pixel processing breaking changes & API discussion #1739 stuff
  • Remove ArrayPoolMemoryAllocator
  • Implement some other minor API changes, e.g. MemoryAllocator.Default. -- Need to change MemoryAllocator.Default; it should be get-only.
  • Fix memory corruption / resource leak bugs
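The OOM-retry behavior described in the first item can be sketched as follows. This is a minimal illustration only; the helper name, retry count, and wait time are assumptions, not the values used in the PR:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical sketch of retrying Marshal.AllocHGlobal on OutOfMemoryException.
// Retry count and delay are illustrative only.
public static class RetryAllocSketch
{
    public static IntPtr AllocWithRetry(int byteCount, int maxRetries = 5)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return Marshal.AllocHGlobal(byteCount);
            }
            catch (OutOfMemoryException) when (attempt < maxRetries)
            {
                // Block the thread for a short time, giving other threads
                // a chance to release memory before we retry.
                Thread.Sleep(25);
            }
        }
    }
}
```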

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR introduces UniformUnmanagedMemoryPoolMemoryAllocator and sets it as default to fix #1596.

UniformUnmanagedMemoryPoolMemoryAllocator functional characteristics

  • For allocations below 1 MB, use ArrayPool<byte>.Shared
  • For allocations above 1 MB, use UniformUnmanagedMemoryPool, which allocates discontiguous 4 MB blocks of unmanaged memory, up to the pool's limit
  • Over the pool limit, allocate unmanaged buffers split into discontiguous 32 MB blocks. These are freed immediately on disposal
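The three tiers above can be summarized in a small decision sketch. The thresholds come from the list; the helper and its return strings are hypothetical:

```csharp
using System;

// Illustrative sketch of the allocation tiering described above.
// 'SelectTier' and its return values are hypothetical names.
public static class AllocationTierSketch
{
    private const long PoolThresholdBytes = 1024 * 1024; // 1 MB

    public static string SelectTier(long lengthInBytes, bool poolHasCapacity)
    {
        if (lengthInBytes < PoolThresholdBytes)
        {
            // Small allocations go to the shared managed pool.
            return "ArrayPool<byte>.Shared";
        }

        // 4 MB uniform blocks from the unmanaged pool while below its limit,
        // otherwise 32 MB unpooled unmanaged blocks freed immediately on disposal.
        return poolHasCapacity ? "UniformUnmanagedMemoryPool" : "UnpooledUnmanaged";
    }
}
```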

Pool size

According to my benchmarks, the pool should scale to the maximum desired size to achieve the best throughput. There is no point placing an artificial pool limit unless there is a physical limitation. I decided to set the maximum pool size to 1/8 of the available physical memory in 64-bit .NET Core processes. This means that on a 16 GB machine the pool can grow as large as 2 GB.

On 32-bit and other (non-testable) platforms, the pool limit is 128 MB.
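The limit heuristic can be expressed roughly like this. GC.GetGCMemoryInfo().TotalAvailableMemoryBytes is the real .NET Core API for querying available physical memory; the rest of the shape is an assumption for illustration:

```csharp
using System;

// Sketch of the pool-limit heuristic: 1/8 of physical memory in 64-bit
// .NET Core processes, a fixed 128 MB elsewhere.
public static class PoolLimitSketch
{
    public static long GetPoolLimitBytes()
    {
        if (Environment.Is64BitProcess)
        {
            long totalPhysical = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes;
            return totalPhysical / 8; // e.g. 2 GB on a 16 GB machine
        }

        return 128L * 1024 * 1024; // 32-bit and other platforms
    }
}
```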

Trimming

Trimming of the pools is triggered both by Gen 2 GC collections and by a timer. (We need the timer since unmanaged allocations don't trigger GC.) On high load we trim the entire pool; on low load we trim 50% of the pool every minute.
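A minimal sketch of the two triggers (timer plus Gen 2 callback) and the load-dependent trim ratio; all names here are illustrative, not the PR's internals:

```csharp
using System;
using System.Threading;

// Illustrative trimming driver: the timer covers workloads where only
// unmanaged allocations happen (no GC runs), while a Gen 2 GC callback
// would invoke OnTrim for GC-heavy workloads.
public sealed class TrimSchedulerSketch : IDisposable
{
    private readonly Timer timer;
    private readonly Action<double> trim; // argument: fraction of the pool to release

    public TrimSchedulerSketch(Action<double> trim)
    {
        this.trim = trim;
        this.timer = new Timer(
            _ => this.OnTrim(highLoad: false),
            null,
            TimeSpan.FromMinutes(1),
            TimeSpan.FromMinutes(1));
    }

    // High load: trim the whole pool; low load: trim half of it.
    public static double TrimRatio(bool highLoad) => highLoad ? 1.0 : 0.5;

    public void OnTrim(bool highLoad) => this.trim(TrimRatio(highLoad));

    public void Dispose() => this.timer.Dispose();
}
```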

Finalizers

With ArrayPoolMemoryAllocator, if an image is GC-d without being disposed, buffers are never returned to the pool. This is not a hard memory leak, but the pools will eventually be exhausted, because the bucket's running index hits the bucket limit.

To avoid this, MemoryGroup<T>.Owned and UniformUnmanagedMemoryPool.FinalizableBuffer<T> have finalizers returning the UnmanagedMemoryHandle to the pool. This can get tricky, since UnmanagedMemoryHandle is also finalizable:

/// <summary>
/// UnmanagedMemoryHandle's finalizer would release the underlying handle, returning the memory to the OS.
/// We want to prevent this when a finalizable owner (buffer or MemoryGroup) is returning the handle to
/// <see cref="UniformUnmanagedMemoryPool"/> in its finalizer.
/// Since UnmanagedMemoryHandle is CriticalFinalizable, it is guaranteed that the owner's finalizer is called first.
/// </summary>
internal void Resurrect()
{
    GC.SuppressFinalize(this);
    this.resurrected = true;
}

internal void AssignedToNewOwner()
{
    if (this.resurrected)
    {
        // The handle has been resurrected
        GC.ReRegisterForFinalize(this);
        this.resurrected = false;
    }
}

I'm moderately concerned about CA2015, but I don't think it applies to us: Dispose will also free the memory backing a span. Touching a span or a pointer into a SkiaSharp image's memory would likewise be a bug if the image has been finalized.

API changes

Resolves #1739
Fixes #1675

Edit: API changes implemented according to #1739.

Benchmarking methodology

To determine these defaults I systematically compared results of LoadResizeSaveParallelMemoryStress runs, typically varying one parameter over a couple of runs while fixing all the others. I have a bunch of Excel documents comparing the tables; including all of them would be TL;DR, but I can present the data on request.

Benchmark results

I was benchmarking on a 10-core (20-thread) i9-10900X with 64 GB RAM, which allowed me to stress a highly parallel workload with very extensive allocation pressure.

Here is the median processing time (in seconds) of 40 runs of LoadResizeSaveParallelMemoryStress ("Classic" means ArrayPoolMemoryAllocator):

Classic    Nopool     Unmanaged Pool
9.038      8.988      8.306
100.56%    100.00%    92.41%

ImageSharp is about 8% faster with the new default memory allocator.

Results of the LoadResizeSaveStressBenchmarks BDN benchmark also show a 7.5% improvement:

Method maxDegreeOfParallelism Mean Error StdDev Ratio Gen 0 Gen 1 Gen 2 Allocated
SystemDrawing 20 3,007.7 ms 151.1 ms 8.28 ms 1.00 - - - 30 KB
ImageSharp_DefaultMemoryAllocator 20 682.6 ms 395.6 ms 21.69 ms 0.23 - - - 2,413 KB
ImageSharp_ArrayPoolMemoryAllocator 20 734.9 ms 174.4 ms 9.56 ms 0.24 1000.0000 1000.0000 1000.0000 1,237,298 KB
Magick 20 1,545.1 ms 624.1 ms 34.21 ms 0.51 - - - 106 KB
MagicScaler 20 389.5 ms 163.7 ms 8.97 ms 0.13 - - - 101 KB
SkiaBitmap 20 889.6 ms 182.9 ms 10.03 ms 0.30 - - - 118 KB
NetVips 20 441.4 ms 686.1 ms 37.61 ms 0.15 - - - 102 KB

VirtualAlloc commit lifetimes graph with ArrayPoolMemoryAllocator


VirtualAlloc commit lifetimes graph with the new allocator, demonstrating the trimming


VirtualAlloc commit lifetimes graph with pool size set to zero


I would be happy to see some expert feedback on this solution, especially for the finalizer tricks.

/cc @Sergio0694 @saucecontrol @br3aker

@br3aker
Contributor

br3aker commented Aug 8, 2021

(We need the timer since unmanaged allocations don't trigger GC)

Might be worth testing trimming solely via GC.AddMemoryPressure & gen2 callback?

@antonfirsov
Member Author

@br3aker I'm not sure whether AddMemoryPressure / RemoveMemoryPressure contributes to the Gen 2 allocation budget or not. Might make sense to try out, but I think the timer is not that expensive, and trimming is configured to run every minute (mimicking ArrayPool.Shared) anyway.
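For reference, the GC.AddMemoryPressure idea would look roughly like this around an unmanaged allocation. Both GC.AddMemoryPressure and GC.RemoveMemoryPressure are real .NET APIs; the wrapper type itself is hypothetical:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper showing how unmanaged allocations could inform the GC,
// so that Gen 2 collections (and a Gen 2 callback) also fire under unmanaged pressure.
public sealed class PressureTrackedBuffer : IDisposable
{
    private readonly long byteCount;
    private IntPtr handle;

    public PressureTrackedBuffer(long byteCount)
    {
        this.byteCount = byteCount;
        this.handle = Marshal.AllocHGlobal((IntPtr)byteCount);
        GC.AddMemoryPressure(byteCount);
    }

    public void Dispose()
    {
        if (this.handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(this.handle);
            GC.RemoveMemoryPressure(this.byteCount);
            this.handle = IntPtr.Zero;
        }
    }
}
```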

@codecov

codecov bot commented Aug 8, 2021

Codecov Report

Merging #1730 (5eaa632) into master (4587649) will increase coverage by 0%.
The diff coverage is 89%.


@@           Coverage Diff           @@
##           master   #1730    +/-   ##
=======================================
  Coverage      87%     87%            
=======================================
  Files         935     944     +9     
  Lines       49300   49753   +453     
  Branches     6102    6165    +63     
=======================================
+ Hits        43175   43575   +400     
- Misses       5115    5157    +42     
- Partials     1010    1021    +11     
Flag Coverage Δ
unittests 87% <89%> (+<1%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/ImageSharp/Advanced/AdvancedImageExtensions.cs 84% <ø> (+24%) ⬆️
src/ImageSharp/Common/Helpers/DebugGuard.cs 0% <0%> (ø)
...eSharp/Common/Helpers/Shuffle/IComponentShuffle.cs 100% <ø> (ø)
src/ImageSharp/Common/Helpers/SimdUtils.Shuffle.cs 92% <ø> (ø)
...icInterpretation/WhiteIsZero24TiffColor{TPixel}.cs 0% <0%> (ø)
...rp/Memory/Allocators/Internals/BasicArrayBuffer.cs 100% <ø> (ø)
...Sharp/Memory/Allocators/SimpleGcMemoryAllocator.cs 66% <ø> (-14%) ⬇️
src/ImageSharp/Memory/UnmanagedMemoryManager{T}.cs 62% <0%> (ø)
...s/Convolution/Convolution2PassProcessor{TPixel}.cs 61% <40%> (ø)
...harp/Memory/Allocators/Internals/Gen2GcCallback.cs 44% <44%> (ø)
... and 118 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@JimBobSquarePants
Member

Cannot wait to get stuck into reading this! 🤩

@antonfirsov
Member Author

@JimBobSquarePants I've been thinking a lot & chatting with folks on the C#/lowlevel Discord channel, and I'm no longer sure that using unsafe memory is the right thing to do.

The problem is that an update to 1.1 may turn minor bugs and performance issues into security issues for users of GetRowSpan(y) and TryGetSinglePixelSpan(). These methods look completely safe, but will become inherently unsafe with this PR. (See the warnings added to the docs in c9d1396.)

On the other hand, SkiaSharp merged a similar API without spending a single minute on security concerns: mono/SkiaSharp#1242

@JimBobSquarePants
Member

@antonfirsov That's the IMemory<T> finalizer warning, yeah? That would require some impressively bad code to trigger, but you're right, we should do a careful review.

I need to do some reading there and maybe see if we can get someone from the runtime team who worked in that area to comment.

@antonfirsov
Member Author

antonfirsov commented Aug 9, 2021

Strictly speaking, the finalizer warning doesn't apply to us, since the SafeHandle is finalizable and returns the memory anyway.

That would require some impressively bad code to trigger

First I also thought so, but in fact it can be as simple as:

var image = new Image<Rgba32>(w, h); // no using
Span<Rgba32> span = image.GetPixelRowSpan(0); // last use of the object `image`, finalizers may run after this point

// some relatively long running code here to allow the finalizers to finish

span[0] = default;  // memory corruption

@JimBobSquarePants
Member

I wonder if we could write an analyzer?

@antonfirsov
Member Author

antonfirsov commented Aug 9, 2021

I need to do some reading there and maybe see if we can get someone from the runtime team who worked in that area to comment.

That would be awesome, but I'm afraid they would share the concerns around unsafe memory and push back.

I wonder if we could write an analyzer?

Would it work out of the box just by using the library? Note that it won't help existing users who update the package without recompiling and then run into a potential security issue.

@JimBobSquarePants
Member

The trick, I think, would be to make the analyzer a dependency of the main library, like xUnit does.

Have we made any breaking changes that require recompilation? Maybe we should just to ensure people should rebuild. 👿

@antonfirsov
Member Author

antonfirsov commented Aug 10, 2021

GrabYourPitchforks — Yesterday at 6:09 PM
The original design for Memory didn't have a span accessor; you had to provide a delegate and the span would be passed as a parameter to the delegate. This solved many (but not all) ownership problems.

There is a safe, breaking way to re-implement span accessors by using delegates, inspired by the comment above:

  public class Image<T>
  {
-     public Span<T> GetPixelRowSpan(int y);
-     public bool TryGetSinglePixelSpan(out Span<TPixel> span);
+     public void ProcessPixels(Action<PixelAccessor<T>> rowProcessor);
+     public bool TryProcessSinglePixelSpan(SpanAction<T> pixelProcessor);
  }

+ public ref struct PixelAccessor<T>
+ {
+     Span<T> GetRowSpan(int y);
+ }

The simplest thing would be to go ahead with this breaking change and bump ImageSharp version number to 2.0. The improvements will justify the change.
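The safety property of the sketch above (the span cannot outlive the callback) can be demonstrated with a minimal self-contained version. Everything here is illustrative, not the proposed API surface:

```csharp
using System;

// A ref struct accessor confines the row span to the callback's scope, so the
// backing memory cannot be finalized/freed while user code still holds a span.
public readonly ref struct RowAccessorSketch<T>
{
    private readonly Span<T> buffer;
    private readonly int width;

    public RowAccessorSketch(Span<T> buffer, int width)
    {
        this.buffer = buffer;
        this.width = width;
    }

    public Span<T> GetRowSpan(int y) => this.buffer.Slice(y * this.width, this.width);
}

public delegate void RowProcessorSketch<T>(RowAccessorSketch<T> accessor);

public static class ImageSketch
{
    // The span handed to 'processor' is only valid inside the callback.
    public static void ProcessPixels<T>(T[] pixels, int width, RowProcessorSketch<T> processor)
        => processor(new RowAccessorSketch<T>(pixels, width));
}
```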

@JimBobSquarePants
Member

bump ImageSharp version number to 2.0

2.0 was to be my kill all old target frameworks release. I want to ship a working V1 of Fonts and Drawing before starting work on it.

@antonfirsov
Member Author

antonfirsov commented Aug 10, 2021

That can be 3.0 then. We follow semantic versioning more or less, so there's no reason to be afraid of major version jumps when breaking changes land, IMO.

@antonfirsov
Member Author

antonfirsov commented Aug 10, 2021

But I'm also fine with a hard-breaking 1.1; this is more about PR and communication than anything else.

However, renaming the milestones seems like the better thing to do to me; we can even benefit from it.

@JimBobSquarePants
Member

JimBobSquarePants commented Aug 10, 2021

My only issue with jumping from 2.0 to 3.0 is that, in real terms, it would probably occur over a short timespan, which, in my opinion, does not reflect well on the quality. 1.1 would be, by far, my next desired target.

This is a massive breaking change though so I'm deeply conflicted. 🙁

@Sergio0694
Member

Disclaimer: I'm not home and with barely functioning internet, so might've missed some bits here.

I have a question regarding PixelAccessor<T>: isn't it quite similar to what we already offer with the ProcessPixelRows___ APIs? Given that those also allow working on arbitrary pixel types and offer batched and vectorized pixel format conversion, should we consider just pushing devs currently accessing individual pixels towards those APIs instead? If not, and we still want to offer a way to access individual pixel values, should we align the design of these new APIs with those, for consistency? Just some random thoughts here 😄

Btw amazing work on all this @antonfirsov, I'll also need to find some time to carefully go through all this like James said and have a proper read, as the whole investigation seems super interesting! 🚀

@JimBobSquarePants
Member

After careful consideration, I'm up for a V2 release. It's a good opportunity to fix a few things, plus we are already adding a significant amount of fixes/functionality to the release, so let's make a show of it.
