-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not use AllocHGlobal in Pkcs12Reader #97447
Conversation
Tagging subscribers to this area: @dotnet/area-system-security, @bartonjs, @vcsjones Issue DetailsMake more targetted fix for #93647
|
@@ -39,19 +39,13 @@ protected void ParsePkcs12(ReadOnlySpan<byte> data) | |||
|
|||
// Windows compatibility: Ignore trailing data. | |||
ReadOnlySpan<byte> encodedData = reader.PeekEncodedValue(); | |||
byte[] dataWithoutTrailing = GC.AllocateUninitializedArray<byte>(encodedData.Length, pinned: true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I have the same concerns here as I did with the other I reviewed. The pinned object heap is not designed for short-lived objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 This is replacing memory leak with a perf bug that can be worse than the memory leak.
Can you wrap the unmanaged memory in a safe handle and AddRef/Release the safe handle around accesses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a perf expert but this is not a hot path by any means - I suppose all operations done later with cert are way heavier than any cost of this specific allocation/freeing - note there is also lots of parsing going on even within constructor but that's without doing any actual crypto related stuff.
I honestly think we should either change code so that pinning is not needed at all or invest in changes which make this construct fast (I'd imagine passing bytes to native code isn't as uncommon).
I personally greatly prefer simplicity to small perf gains but if there is evidence this is as bad then maybe it is worth it.
When I run this in a tight loop (that was just ctor not E2E) with different cert sizes OOM was very visible while this potential perf issue wasn't particularly noticable (but I didn't measure the throughput) - I'm having a really hard time seeing how it can be worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't about the perf of this code. It's about the effect it has on the rest of the application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about the effect it has on the rest of the application.
Just to demonstrate what I'm talking about, here it's not the Mean that I care most about (though that it is significantly slower and reflective of the impact this has on not only this code but other code in the process as well), but more the fact that in the non-pinned version there are 0 gen1 and 0 gen2 GCs, and that's no longer true in the pinned version: the POH gets collected as part of gen2, so short-lived allocations that would otherwise be gen0 become gen2, while also leading to excessive fragmentation of the POH, and that affects everything else in the app.
Method | Length | Mean | Error | Ratio | Gen0 | Gen1 | Gen2 |
---|---|---|---|---|---|---|---|
NotPinned | 10 | 7.054 ns | 0.0653 ns | 1.00 | 0.0004 | - | - |
Pinned | 10 | 234.552 ns | 4.7442 ns | 33.26 | 0.0014 | 0.0014 | 0.0014 |
NotPinned | 1000 | 78.747 ns | 0.7053 ns | 1.00 | 0.0111 | - | - |
Pinned | 1000 | 1,175.699 ns | 23.2682 ns | 15.03 | 0.0267 | 0.0267 | 0.0267 |
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);
[MemoryDiagnoser]
public class Tests
{
[Params(10, 1000)]
public int Length { get; set; }
[Benchmark(Baseline = true)]
public byte[] NotPinned() => GC.AllocateUninitializedArray<byte>(Length, pinned: false);
[Benchmark]
public byte[] Pinned() => GC.AllocateUninitializedArray<byte>(Length, pinned: true);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity - does allocating unpinned and then using scoped fixed
statement make any difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity - does allocating unpinned and then using scoped fixed statement make any difference?
Yes, that would work just fine in this case. However, I am not sure whether it is the right choice for this code. I assume that it allocates pinned memory to prevent GC from creating copies of the certificate in process memory when it compacts the heap. Comment that explains why this allocates pinned memory would be nice.
Why aren't pinned objects (or at least arrays) implemented in terms of GlobalHAlloc?
It would not be possible. GlobalHAlloc allocations are not managed by the GC. You have to free the allocation explicitly. Pinned object heap allocations are managed by the GC. They are collected when no longer referenced, there is no explicit free operation for you to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, the pinned object heap would not have these perf characteristics and its performance characteristics would be more like GlobalHAlloc. There is nothing fundamental preventing that. It is not how it behaves today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that it allocates pinned memory to prevent GC from creating copies of the certificate in process memory when it compacts the heap.
Yep. Well, the concern is with the private key material instead of the certificates per se, but the point is to avoid GC compaction clones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub @jkotas @bartonjs I've updated the code back to the old version and I'm using SafeHandle now, please review
eabfcf2
to
4a4cfe2
Compare
4a4cfe2
to
aa43fe8
Compare
/// <summary> | ||
/// SafeHandle for LocalAlloc'd memory which calls ZeroMemory when releasing handle. | ||
/// </summary> | ||
internal sealed class SafeLocalAllocWithClearOnDisposeHandle : SafeCrypt32Handle<SafeLocalAllocWithClearOnDisposeHandle> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
....Security.Cryptography/src/System/Security/Cryptography/X509Certificates/UnixPkcs12Reader.cs
Outdated
Show resolved
Hide resolved
} | ||
|
||
public unsafe PointerMemoryManager<byte> GetMemoryManager() | ||
=> new PointerMemoryManager<byte>((byte*)handle, _length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is bug prone to hand out the pointer to the resource managed by the handle like this. What is going to guarantee that the handle is not Disposed while the caller uses the memory manager?
The proper pattern to use with safe handles looks like this:
bool success = false;
try
{
handle.DangerousAddRef(ref success);
// get the span out the handle and do the operation on the memory
// owned by the handle. Do not pass any pointers to the memory outside
// of this scope
}
finally
{
if (success)
handle.DangerousRelease();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that approach would require a bit of refactoring since try..finally would need to be outside of the UnixPkcs12Reader and size is unknown outside. I'll move the method out of the handle class so it's not accidentally misused in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will also add Dangerous prefix and disclaimer in the xml doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do not use this pattern, you are not getting any additional safety from using the safe handle and the safe handle is just an unnecessary overhead. You can keep store the pointer to memory directly and free it in the finalizer like what was implemented in #93647.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've ended up adding UnmanagedCryptBufferMemoryManager to isolate the pattern correctly - I'm still having hard time seeing why I should use DangerousAddRef (that honestly seems worse than just using native handle as you need to count references by hand) - I ended up using pattern described in here: https://learn.microsoft.com/en-us/dotnet/standard/garbage-collection/implementing-dispose . I'd prefer we do both solutions - mine focuses on making this buffer less error prone but I think we should still go ahead with changes proposed in #93647 but I don't think we should be touching GlobalHAlloc directly anywhere in this (Pkcs12Reader) code as it seems very fragile to refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SafeHandle wraps a IntPtr handle of the unmanaged resource and a thread safe counter. DangerousAddRef/DangerousRelease that are expected to be called around the code that uses the unmanaged resource increment/decrement the counter, and the SafeHandle Dispose method checks whether the counter is zero. If it is non-zero, it means that some other code is using the unmanaged resource. The safe handle will delay the release of the unmanaged resource until the counter drops to zero.
So if you are not calling the DangerousAddRef/DangerousRelease when using the unmanaged resource, SafeHandle is a just an expensive no-op wrapper for an IntPtr.
I understand that the manual AddRef/Release around the uses may look painful, but it is what it takes to deal with unmanaged resources without risk of race conditions or leaks in .NET.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if you are not calling the DangerousAddRef/DangerousRelease when using the unmanaged resource, SafeHandle is a just an expensive no-op wrapper for an IntPtr.
And worse than a nop, it makes it more likely that you'll have use after free bugs, because without the ref counting from the add/release, if the SafeHandle is dropped while you're using the pointer, its finalizer may release the memory still in use.
#98331 went with the approach of just fixing the data flow to ensure Dispose was always called; the strategy of using native memory is unchanged. |
Make more targetted fix for #93647