Keep parameter values out of IMemoryCache in RelationalCommandCache #34803
Thanks @yinzara, this looks like a good fix - see minor comments below.
Force-pushed from bc4f507 to 201dfe2.
All done now. Sorry about all the force pushes :) |
Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes dotnet#34028
All requested modifications have been completed. All checks have passed. Ready for merge when you are. I know that this defect has technically existed for many prior versions of .NET. Do we still port defects to the Can I help in doing that or is this something only you guys do?
@yinzara the enumeration of the dictionary - to create the array from it - isn't guaranteed to have the same ordering every time (dictionary ordering is indeterminate), so the arrays could have ended up having the values in different positions. I pushed a commit that goes back to using a simple dictionary for now, and cleans up various remaining things - please take a look and tell me if you see any issues. Regarding efficiency, this whole mechanism needs to be optimized: the parameter nullness should be a single efficient bitmap, along with an additional thing for the FromSql object arrays. But for now we're looking for a simple fix - one that could potentially be backported too - so I'm trying to minimize change and risk. |
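To illustrate the ordering concern above, here is a minimal C# sketch (not the code from this PR) of flattening a parameter dictionary into an array. `ParameterOrdering`, `Flatten` and `FlattenSorted` are hypothetical names introduced only for this example. `Dictionary<TKey,TValue>` documents its enumeration order as undefined, so only the sorted variant produces arrays that can safely be compared position by position.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class ParameterOrdering
{
    // Copies (name, nullness) pairs in whatever order the dictionary enumerates.
    // Two dictionaries with the same keys may produce differently ordered arrays.
    public static (string Name, bool IsNull)[] Flatten(
        IReadOnlyDictionary<string, object?> parameters)
        => parameters
            .Select(p => (Name: p.Key, IsNull: p.Value is null))
            .ToArray();

    // Same, but ordered by parameter name with an ordinal comparison, so the
    // position of each parameter no longer depends on enumeration order.
    public static (string Name, bool IsNull)[] FlattenSorted(
        IReadOnlyDictionary<string, object?> parameters)
        => parameters
            .OrderBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => (Name: p.Key, IsNull: p.Value is null))
            .ToArray();
}
```

With the sorted variant, two dictionaries that hold the same parameter names with the same nullness yield equal arrays regardless of insertion order. Keeping a dictionary and looking each name up, as described above, avoids the ordering question entirely.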
6.0 is still in support, though only until November (support policy), and the bar is quite high. We'll discuss in the team if we want to backport this and to which exact versions; in any case, that's something we'll do on our side - thanks for the offer to help though. |
@dotnet/efteam would appreciate a review here as well, as it's a possible servicing candidate and needs to be accurate... |
I had a prior commit that used ValueTuple<string,ParamValueInfo>, which worked and handled the sorting, and I was asked to change it :-) but ok. (Sent by email in reply to Shay Rojansky's comment of Mon, Oct 7, 2024, 12:01 PM.)
@yinzara my bad - sorry for the extra work. Yeah, I thought about doing sorting but decided to do the simplest/safest thing here for now (again we can optimize later). |
If you still want it:
private readonly struct CommandCacheKey
    : IEquatable<CommandCacheKey>
{
    private readonly Expression _queryExpression;
    private readonly (string, ParameterValueInfo)[] _parameterValues;

    internal CommandCacheKey(Expression queryExpression, IReadOnlyDictionary<string, object?> parameterValues)
    {
        _queryExpression = queryExpression;
        _parameterValues = new (string, ParameterValueInfo)[parameterValues.Count];
        var i = 0;
        foreach (var (key, value) in parameterValues)
        {
            _parameterValues[i++] = (key, new ParameterValueInfo(value));
        }
        Array.Sort(_parameterValues);
    }

    public override bool Equals(object? obj)
        => obj is CommandCacheKey commandCacheKey
            && Equals(commandCacheKey);

    public bool Equals(CommandCacheKey commandCacheKey)
    {
        // Intentionally reference equal, don't check internal components
        if (!ReferenceEquals(_queryExpression, commandCacheKey._queryExpression))
        {
            return false;
        }
        Check.DebugAssert(
            _parameterValues.Length == commandCacheKey._parameterValues.Length,
            "Parameter Count mismatch between identical expressions");
        for (var i = 0; i < _parameterValues.Length; i++)
        {
            var (thisKey, thisValue) = _parameterValues[i];
            var (otherKey, otherValue) = commandCacheKey._parameterValues[i];
            Check.DebugAssert(
                thisKey == otherKey,
                "Parameter Name mismatch between identical expressions");
            if (thisValue.IsNull != otherValue.IsNull)
            {
                return false;
            }
            if (thisValue.ObjectArrayLength != otherValue.ObjectArrayLength)
            {
                return false;
            }
        }
        return true;
    }

    public override int GetHashCode()
        => RuntimeHelpers.GetHashCode(_queryExpression);
}

// Note that we keep only the nullness of parameters (and array length for FromSql object arrays), and avoid referencing the actual parameter data (see #34028).
private readonly struct ParameterValueInfo
{
    public bool IsNull { get; }
    public int? ObjectArrayLength { get; }

    internal ParameterValueInfo(object? parameterValue)
    {
        IsNull = parameterValue == null;
        ObjectArrayLength = parameterValue is object[] arr ? arr.Length : null;
    }
}
Not quite sure if doing Array.Sort() on an array of tuples is right... In any case, I'd rather leave things closer to the current implementation for now. The null-ness information really should just be a bitmap in any case. |
I promise this is my last rebuttal :-) and feel free to not read it and just do what you want, as I completely agree this is not my project. I won't respond again.
Since C# 7, ValueTuples have implemented IComparable and, if you read the code, always compare each element using the default Comparer. Since we know that an IReadOnlyDictionary can only contain unique keys by definition, the first element will always be the only one necessary for sorting, so we can know that the order of the comparison of keys will be consistent and predictable. The only difference in the algorithm in my solution from the existing solution is the use of the
While I completely understand the goal is to minimize change for ease of portability, the entirety of the CommandCacheKey (and ParameterValueInfo) must be ported back as a whole, as the entirety of the class has changed save a few choice lines; and if you follow the commit history of this file back to the 6.0 release, the entire file was reformatted, so both solutions should be equally portable. Objectively your solution is fewer lines of code changed, I will completely agree. I just don't think it's more/less complex to migrate to older versions.
I would hope the preferred outcome would be if we could avoid having to make any future changes for this issue while still having an ideal solution given the goals below.
The goals of the solution are in order of importance:
Assumptions
Current Proposed solution
There is only one property, a Dictionary. The dictionary has two arrays in it - int[] (size 4xN bytes) + Entry[] (size 8*N + 16 (reference to string key) + (1 [nullness bool] + 5 [nullable int, length of array]) * N + 4 int properties (4x4) and a ulong property (8) = 13N + 40 bytes)
2 object allocations, approx 72+18N bytes
Construction of a CommandCacheKey is an Order N (to iterate over the keys) * N Log N (to insert in dict, though might be more or less, hard to say).
Equality test is an Order N (to iterate over the keys) * Log N (to lookup in the dict) operation.
Potentially Most Efficient Solution using BitMaps
We would have a
That's 2 object allocations (16 bytes * 2) plus N/4 (byte array) plus 5*N (int? array)
32+N/4+5*N bytes - basically half of your proposed solution
In this solution we would need to iterate over the keys of the IReadOnlyDictionary during the construction of the CommandCacheKey in alphabetical order. This will require at least one object allocation and some memory (though it will be reclaimed after construction). That operation would be at least an order N operation, potentially N Log N I think in the worst case. It would then be another 2O(N) operation to fill the bitmap and int array. The code to do so will also be relatively complex compared to your or my solution.
So construction of CommandCacheKey would be Order N*Log N + 2N with 3 object allocations
Equality tests would be 2*O(N)
My Proposed Solution
1 object allocation (the tuple array)
That's 1 object allocation (16 bytes) plus N (bool?) + 5N (int?)
So my solution is 3 bytes more for every 4 elements in the key, so 3/4 * N bytes more. For the default configuration of the
Construction of a CommandCacheKey would be an N (iterate over the keys) + N (best case) Log N operation (worst case)
Equality tests would be 3O(N) or 2O(N), though the difference is mostly irrelevant.
Conclusions
So the only real difference between my solution and the bitmap solution is less than 1k of memory for the entire application in exchange for 2 unnecessary object allocations. As an engineer, I would definitely give up the 1k for the avoidance of the object allocations. That means I would say the bitmap solution isn't actually ideal at all and my solution is the most ideal. It has nearly the lowest memory footprint, the lowest computational overhead, and is still of a low code complexity.
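The tuple-sorting behaviour discussed in this comment can be checked with a small standalone C# sketch. `NotComparable` is a hypothetical stand-in for `ParameterValueInfo` and `TupleSortDemo` is an illustrative name; neither comes from the PR. The sketch relies on `ValueTuple<T1,T2>` comparing its first element before its second, so the second element is only consulted when two names compare as equal.

```csharp
using System;

// Hypothetical stand-in for ParameterValueInfo: it deliberately does NOT
// implement IComparable, just like the struct in the proposal.
public readonly struct NotComparable
{
    public NotComparable(bool isNull) => IsNull = isNull;

    public bool IsNull { get; }
}

public static class TupleSortDemo
{
    // Returns a sorted copy. Array.Sort uses the default comparer for the
    // tuple type, which compares Item1 (the name) first and only falls back
    // to Item2 on a tie.
    public static (string, NotComparable)[] SortByName((string, NotComparable)[] items)
    {
        var copy = ((string, NotComparable)[])items.Clone();
        Array.Sort(copy);
        return copy;
    }
}
```

Two caveats, offered as observations rather than verdicts on either proposal: the default string comparison is culture-sensitive rather than ordinal, and if two names ever compared as equal the sort would fall through to the non-comparable second element and fail at run time. With unique dictionary keys that differ under the current culture, neither situation arises.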
@yinzara thanks for engaging - that's quite a lot of thinking/analysis on this topic...
Well, formatting changes aren't a big deal here; and in any case, it's not so much about the difficulty of backporting itself as the size of the change we're backporting. Your proposal simply changes the code more significantly (switching away from a dictionary to a sorted array), so it's a bit riskier.
I won't go into all the specific size/length calculations, as I think you've made some assumptions here and overlooked some optimizations... If we really want to optimize this, we would use BitVector32 when the number of parameters is <= 32 (virtually all cases) - note that it's a value type, zero allocations - and fall back to BitArray for more than 32 parameters. The array lengths would be nullable, so that queries without an
Further, from a CPU perspective, in the common case we'd be comparing BitVector32, which is equivalent to comparing two uints (as efficient as one can get). Even for the other cases, when comparing simple arrays, looping over an array and performing two equalities as you're doing above is unlikely to be efficient; doing a single memcmp on a memory block is likely to be far better. This can be done in .NET via Span.SequenceEqual - see https://stackoverflow.com/questions/43289/comparing-two-byte-arrays-in-net/48599119#48599119.
To summarize, it's really worth distinguishing between two things: (1) fixing the current bug (and backporting it), and (2) optimizing the cache key. Trying to do both at the same time is riskier, and we really try to prioritize the stability of existing versions, so we're very conservative with the changes we patch. The solution you're proposing above doesn't optimize things as much as it could, and at the same time represents more of a departure from the current implementation than is necessary (and therefore more risk).
I'll keep this open in case you have any additional comments (those are always welcome!), and also because we need to decide whether we'll be backporting this (I'll discuss this with the team tomorrow). Of course, it's fine for you (or anyone) to have a different opinion on the above! In fact, I'd be happy to accept a PR from you later which does the additional optimization pass |
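As a rough illustration of the optimization direction described above (and not an implementation from this PR), the following C# sketch packs parameter nullness into a `BitVector32` and compares optional array lengths with a span `SequenceEqual`. `NullnessKey` is a hypothetical type. The sketch assumes the parameter values arrive in a stable order, handles at most 32 parameters, and omits the `BitArray` fallback. Using -1 to mean "not an object array" is also an assumption made for this example.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical sketch: nullness as one bitmap, array lengths only when needed.
public readonly struct NullnessKey : IEquatable<NullnessKey>
{
    private readonly BitVector32 _nullness;  // bit i set => parameter i is null
    private readonly int[]? _arrayLengths;   // stays null when no object[] parameter exists

    public NullnessKey(object?[] orderedValues)
    {
        if (orderedValues.Length > 32)
        {
            throw new ArgumentException("This sketch supports at most 32 parameters.");
        }

        var bits = 0;
        int[]? lengths = null;
        for (var i = 0; i < orderedValues.Length; i++)
        {
            if (orderedValues[i] is null)
            {
                bits |= 1 << i;
            }
            else if (orderedValues[i] is object[] array)
            {
                if (lengths is null)
                {
                    // Allocate lazily; -1 marks "not an object array".
                    lengths = new int[orderedValues.Length];
                    Array.Fill(lengths, -1);
                }
                lengths[i] = array.Length;
            }
        }

        _nullness = new BitVector32(bits);
        _arrayLengths = lengths;
    }

    public bool Equals(NullnessKey other)
    {
        // Comparing the bitmaps is a single integer comparison.
        if (_nullness.Data != other._nullness.Data)
        {
            return false;
        }

        // A null array converts to an empty span, so two keys without
        // object[] parameters compare equal here without any allocation.
        ReadOnlySpan<int> mine = _arrayLengths;
        ReadOnlySpan<int> theirs = other._arrayLengths;
        return mine.SequenceEqual(theirs);
    }

    public override bool Equals(object? obj) => obj is NullnessKey other && Equals(other);

    public override int GetHashCode() => _nullness.Data;
}
```

Only nullness and lengths are retained, so the key holds no reference to the parameter values themselves, which is the property the fix in this PR is after.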
I very much appreciate the thoughtful reply. Makes me want to continue to contribute. |
…net#34803) Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes dotnet#34028 Co-authored-by: Shay Rojansky <roji@roji.org> (cherry picked from commit af420cd)
Only store necessary info for parameter values in RelationalCommandCacheKey to prevent memory leaks.
This is a relatively serious memory leak defect and needs to be ported back to the release/8.0 branch as well.
Fixes #34028