Proposal: GC.RegisterMemoryPressureCallback #53895
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
If we maintain a strong ref to the callback, its entire object graph remains alive. So the delegate model might not be as fragile as I'd originally thought. |
@GrabYourPitchforks Yeah, my thinking was that consumers would have to ensure that the input delegate is either wrapping a static method, or a stateless lambda. I agree that that's potentially error prone; really my main concern with the interface-based pattern though is that it'd make existing patterns with a target instance much more verbose and even more error prone in some cases, unless I'm reading that wrong. From this last comment I think we might agree on this now then 😄 For reference (and for others reading), what I'm saying is that, assuming the target instance is a public type that you don't want consumers to see the interface applied to, then things get very complicated and error prone with the interface-based callback.

Case 1 (function):

```csharp
public sealed class WeakReferenceMessenger
{
    public WeakReferenceMessenger()
    {
        static void MemoryPressureCallback(object? target)
        {
            ((WeakReferenceMessenger)target).Cleanup();
        }

        GC.RegisterMemoryPressureCallback(MemoryPressureCallback, this);
    }

    private void Cleanup() { }
}
```

Case 2 (interface):

```csharp
public sealed class WeakReferenceMessenger
{
    public WeakReferenceMessenger()
    {
        GC.RegisterMemoryPressureCallback(new CleanupCallback(this), true);
    }

    private sealed class CleanupCallback : IMemoryPressureCallback
    {
        private readonly WeakReference<WeakReferenceMessenger> target;

        public CleanupCallback(WeakReferenceMessenger target)
        {
            this.target = new WeakReference<WeakReferenceMessenger>(target);
        }

        public bool Invoke()
        {
            if (target.TryGetTarget(out WeakReferenceMessenger? messenger))
            {
                messenger.Cleanup();
                return true;
            }

            return false;
        }
    }

    private void Cleanup() { }
}
```

This seems way more verbose and actually more error prone to me, as consumers would have to remember to set the reference to be strong, because they'd actually be referencing the callback wrapper and not the actual target instance. They'd then also have to manually keep a weak reference to the actual target, and then also check whether that's alive in the callback 🤔 The other API shape just seems overall much easier to use, with fewer chances to get things wrong in general. Thoughts?
Would supporting unregistering a callback make sense? The method could return an |
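One hedged sketch of what an unregistration-friendly shape could look like, following the pattern of existing handle types like `CancellationTokenRegistration` and `PosixSignalRegistration` (everything below, including the `MemoryPressureRegistration` name and the `IDisposable`-returning overload, is hypothetical and not part of the proposal):

```csharp
using System;

// Hypothetical alternative shape: registration returns a disposable handle.
public static class GCSketch
{
    public static IDisposable RegisterMemoryPressureCallback(
        Func<object?, bool> callback, object? target)
    {
        // Runtime bookkeeping would live here in a real implementation.
        return new MemoryPressureRegistration();
    }

    private sealed class MemoryPressureRegistration : IDisposable
    {
        // Disposing the handle unregisters the callback so it is never
        // invoked again, mirroring CancellationTokenRegistration.
        public void Dispose() { }
    }
}
```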
Tagging subscribers to this area: @dotnet/gc

Issue Details

Background and Motivation

There are a number of scenarios where library authors would like to perform some kind of cleanup/trimming of data structures or other objects, with the assumption that this operation should ideally run when there's enough memory pressure and as long as some target object is alive (typically the object that's being trimmed). An example of this is `TlsOverPerCoreLockedStacksArrayPool<T>` in the runtime, which leverages the internal `Gen2GcCallback` type to automatically execute trimming whenever a gen 2 GC collection occurs. Other library authors (myself included) have had the need to achieve something like this as well, for instance `WeakReferenceMessenger` in `Microsoft.Toolkit.Mvvm` (link) and `Ben.StringIntern` (link).

The current approach relies on just copying the `Gen2GcCallback` type, but that has a number of issues.

This proposal is for a proper, built-in API to register a callback to be executed when there's memory pressure. This means that the runtime/GC would also be free to decouple the execution from just strictly when a gen 2 collection occurs, leaving more space to possibly make decisions on how to execute the callback as well. It would also be easier to use for consumers, and more reliable (for instance, running arbitrary code from a finalizer is not really the best idea in the first place).

Proposed API

The API shape I had in mind was to essentially just expose the same features as `Gen2GcCallback`, but with a single API:

```csharp
namespace System
{
    public static class GC
    {
        public static void RegisterMemoryPressureCallback(Func<object?, bool> callback, object? target);
    }
}
```

The state can either be `null`, or some instance. If it's an instance, then the GC will keep that reference as a weak reference (passing this instance to this API will not keep that state alive after the call to `RegisterMemoryPressureCallback`), and also automatically unregister the callback when the object is collected. Meaning that if you do pass a target instance, then the callback can assume that the input value will never be `null`. This would allow consumers to achieve the same as the two overloads for `Gen2GcCallback`: either just pass a `null` target and ignore the input state in the callback, which would just act as a static and stateless callback, or pass some target instance and then get it as input in the callback.

This API would be easy to use and it would support all the existing scenarios as with `Gen2GcCallback`.

Usage Examples

See any of the existing `Gen2GcCallback` usages linked above.

Alternative Designs

Here's an alternative proposal with an interface-based callback from @GrabYourPitchforks:

```csharp
namespace System
{
    public interface IMemoryPressureCallback
    {
        bool Invoke();
    }

    public static class GC
    {
        public static void RegisterMemoryPressureCallback(IMemoryPressureCallback callback, bool strongReference);
    }
}
```

Quoting from Levi:

My main issue with this design is that, quoting myself from our previous discussion:

Risks

None that I can see. The API would also be in the `GC` type, which is only ever used by more advanced users. Either of the proposed APIs would also be much more reliable and generally better than using `Gen2GcCallback` like is the case today. Additionally, not having arbitrary code being run from within the finalizer queue reduces the chance of possible other issues (eg. someone locking from there or something).
In case the assembly used to define the delegate is collectible (and meant to be collected), wouldn't this design prevent the assembly from being unloaded?
Re: In the case of ASPNET, we had a type |
The delegate would be strongly held until the next memory pressure event, then it would be invoked. The delegate may choose to re-register itself by returning true. If it does not re-register itself, it becomes eligible for collection, which would presumably then allow the entire assembly to be unloaded.
Related to this, also if the delegate was stateful and targeting an object that's in that assembly as well, then it'd simply be automatically removed on the next event. As in:
If the delegate was instead not stateful then yeah you'd just have to manually return |
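To make the two cases above concrete, here's a small sketch of how those semantics would play out (`GC.RegisterMemoryPressureCallback` is the API proposed in this issue and does not exist yet; `cache`, `MyCache`, and the trimming helpers are placeholders):

```csharp
// Stateful: the runtime holds 'cache' weakly, so once 'cache' is collected
// the callback is unregistered automatically without any manual step.
GC.RegisterMemoryPressureCallback(static target =>
{
    ((MyCache)target!).Trim();
    return true; // keep the registration alive while the target is
}, cache);

// Stateless: no target tracks the lifetime, so the callback has to opt
// out explicitly by returning false once it has nothing left to do.
GC.RegisterMemoryPressureCallback(static _ =>
{
    bool moreWorkExpected = TrimGlobalState();
    return moreWorkExpected; // returning false unregisters the callback
}, null);
```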
If we're talking about callback models we probably also have to figure out ExecutionContext stuff, flow suppression, blah blah. |
I mean my thinking here was to open the proposal to start a discussion and get an area owner to have a look and comment on whether that'd be something that would be reasonable to add at all, and if so then go through the various open questions and details to get the fancy But sure going through some more details already sounds great! Can you elaborate a bit more on those points you raised? In general my idea for the API was that it wouldn't give you any guarantees on where the callback would be invoked. As in, the runtime would be free to invoke it from any arbitrary thread (not necessarily the one that registered it, and consecutive invocations of the same callbacks wouldn't give guarantees to use the same thread either) at some point in time where it considered memory pressure to be relevant (eg. before/after a gen 2 collection, or with whatever other heuristics) 🤔 |
Re: flowing context, normally when we capture a delegate for execution on a separate thread, we're expected to flow the ExecutionContext. This captures async locals, thread culture and identity, and a few other thread-ambient characteristics, and it restores those on the target thread before the delegate is invoked. The pattern we use is that these sorts of methods capture the EC by default, and we have sibling Unsafe*-prefixed methods that don't capture the EC. If we wanted the behavior of this API to be that it never captures the EC, that's fine. We'd likely have only the Unsafe*-prefixed method in that scenario. And if the caller really wants to capture the EC, they can do so manually. |
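For reference, the manual capture Levi mentions would look roughly like this with the real `ExecutionContext.Capture`/`ExecutionContext.Run` APIs (the `Unsafe`-prefixed registration method and the `Trim`/`myCache` names are hypothetical):

```csharp
using System;
using System.Threading;

// Capture the current ExecutionContext (async locals, culture, identity, ...)
// at registration time, since the API itself would not flow it.
ExecutionContext? context = ExecutionContext.Capture();

GC.UnsafeRegisterMemoryPressureCallback(target =>
{
    bool keepRegistered = false;

    if (context is not null)
    {
        // Restore the captured context around the actual callback work.
        ExecutionContext.Run(context, state => keepRegistered = Trim(state), target);
    }
    else
    {
        keepRegistered = Trim(target);
    }

    return keepRegistered;
}, myCache);
```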
As discussed on Discord as well, yeah that sounds totally reasonable to me, especially given that I would expect that to simplify the implementation and give the runtime more freedom to handle things internally however it needs to. Also, with respect to all those existing use case scenarios of |
For context, the reason the EC stuff matters here is that as an implementation detail we may want to consider dispatching all of these callbacks via |
I see, yeah that makes perfect sense, thanks for the additional info! |
Was wondering whether @Maoni0 or someone from the GC team could share some thoughts on the proposal, now that planning for .NET 7 is starting again. Specifically, assuming the design @GrabYourPitchforks and I landed on looks reasonable, I was wondering whether the issue could be assigned the .NET 7 milestone so it could be looked at during internal planning, and/or it could get the "api ready for review" tag in case the proposal seemed acceptable as is. Or if that wasn't the case, if you could share some feedbacks on what to improve or change if needed, so we could iterate on it and work towards making it ready for review. Thanks! 😄 |
We have a number of similar low-level global callbacks that do not capture execution context. For example, AppDomain.UnhandledException or NativeLibrary.SetDllImportResolver. I do not think that it makes sense for this API to have
We have used a different pattern to solve the same problem in the recent PosixSignal APIs: #50527 (comment). Should we have some sort of consistency in how we are exposing callbacks like this?
Does the API need to communicate the severity of the memory pressure and/or allow registering only for certain levels? As is, the API would not really be a sufficient replacement for what is done in ArrayPool:
|
Oh, right. I don't have a strong preference on either, especially given that the behavior would properly be documented in the API anyway and I think it'd be reasonable to assume that developers using this API would take the time to read its docs first. I liked the
Some reasons for the current proposal were, in no particular order:
For the current proposal, the goal was to offer a proper replacement for the |
the whole GC team is still heads down with the .NET 6 work. but that shouldn't prevent you from having a discussion about this - it just means someone from the GC team will look at the discussion in the not too distant future. |
Thank you for chiming in! And yeah, I mostly just wanted to start a conversation again on this to get more feedback and eventually settle on a good API proposal for FXDC to review. Given that we cut it pretty close with … Jan raised some great points (unsurprisingly 🙂) so I'm curious to hear back on that, and then I'll be looking forward to the GC team also sharing some thoughts on this once the work for .NET 6 on their end is done as well, as you mentioned. Thanks! 😄
I'm not sure this is actually a good reason. If this is about preventing code breakage (semantic versioning) should you wish to no longer implement GC pressure callbacks, you can still keep implementing the interface but never register with the GC. If it's about not wanting to expose the fact that your code does this at all, then I'd have to ask why? If anything, it makes the behavior of the code more introspective (a developer wondering if it's a terrible idea to use your cache or whatever can see that it implements
This is borderline bikeshedding, but I saw that as mainly to distinguish the EC-free routine from the existing On Maybe it would be best for the callback to look more like This would avoid a thundering herd responding to the GC pressure callback with their own calls to |
This would strictly be an internal implementation detail for types supporting this, so I really don't feel like this should leak through the public API surface. It's not so much about consumers being able to explicitly call the cleanup method on their end (though that would also not necessarily be good), but more about this just looking like bad design to me. Not to mention, as I said, the fact that you'd be somewhat limited if you wanted to later on change the way the type does cleanup, as you'd then have to keep that dummy interface there to avoid breaking consumers using your type.
I mean, as I mentioned I mostly liked that suggestion due to the reason Levi mentioned, but I wouldn't really mind if we ditched the
I actually don't mind this idea, especially after seeing that other types we used as reference, such as … (lines 187 to 190 in d40fbf4):
As a strawman updated proposal, I guess we could have something like this:

```csharp
namespace System
{
    public static class GC
    {
        public static void RegisterMemoryPressureCallback(Func<object?, GCMemoryInfo, bool> callback, object? target);
    }
}
```

At this point my concern though is that a … Another open question: would the fact that all invoked delegates would get the same memory info be important? Eg. in case one target did some trimming and the GC ran right before the next one, then that memory info would become stale. Would that be an issue, or would that be considered fine given that handlers would always be assumed to be invoked in a non-specific order?
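To sketch why the `GCMemoryInfo` parameter would be useful, a callback could scale its trimming based on how close the reported load is to the high-memory threshold. `MemoryLoadBytes` and `HighMemoryLoadThresholdBytes` are real `GCMemoryInfo` properties; the `MyCache` type, its trimming methods, and the threshold values are placeholders:

```csharp
using System;

static bool OnMemoryPressure(object? target, GCMemoryInfo info)
{
    var cache = (MyCache)target!;

    // Ratio of current memory load to the level the GC considers "high".
    double load = (double)info.MemoryLoadBytes / info.HighMemoryLoadThresholdBytes;

    if (load >= 0.9)
    {
        cache.TrimAll();     // severe pressure: drop everything
    }
    else if (load >= 0.7)
    {
        cache.TrimExcess();  // moderate pressure: trim opportunistically
    }

    return true; // stay registered for the next notification
}
```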
@Sergio0694, @Wraith2, @antonfirsov, @rickbrew and others who might be interested. I have a prototype branch that should work and would like validation from your scenarios! Check out the documentation as well to get started. |
Will it trigger multiple times once a callback is registered, every time it needs to do a Gen2 GC? Also, what if most of the memory is inside of a static MemoryCache variable? Can we use that callback to then force that cache to be cleared, since I think the GC ignores static memory? I also run into this problem where I basically memory cache some Discord event information in order to reduce the usage of API requests (to avoid getting temp API banned on their end by hitting 10k ratelimited requests in under 10 minutes), so having an API like this where the memory cache does not get cleared too early or never clears at all would help me as well. Especially in cases where I might have duplicates of the same object that updated and only want to keep the newest items of a specific type of data returned from Discord for specific users, roles, channels, servers, etc.
Right now, the same callback will be called multiple times when the GC believes we are running low on memory. There are other cases where a blocking gen 2 GC may happen (e.g. the user is calling …

When you say a static MemoryCache variable, do you mean something like this?

```csharp
public class MemoryCache
{
    public static byte[] data = new byte[10000];
}
```

Assuming this is the only reference to the byte array, setting it to null should let the GC collect it in the next collection.

```csharp
public void Callback()
{
    MemoryCache.data = null;
}
```
Yes I mean that the variable / property that holds the data is a C# static. However they are not always nullable in my codebase (some of them are list based) and as such I have to then calculate which items in the list I need to evict in said list like collection (perhaps I could make each item have an embedded timestamp of when it was added to it and then compare similar items with that timestamp to keep only the newest one and collect the older ones). |
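The timestamp-based eviction idea described above could be sketched like this (a standalone illustration, not code from the thread; how it would be wired up to the pressure callback is left out):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class EventCache<TKey, TValue> where TKey : notnull
{
    private readonly Dictionary<TKey, (TValue Value, DateTime Added)> entries = new();

    public void AddOrUpdate(TKey key, TValue value)
    {
        // Newer entries always replace older duplicates for the same key.
        entries[key] = (value, DateTime.UtcNow);
    }

    // Invoked from a memory pressure callback: drop the oldest half.
    public void TrimOldest()
    {
        var oldest = entries
            .OrderBy(pair => pair.Value.Added)
            .Take(entries.Count / 2)
            .Select(pair => pair.Key)
            .ToList();

        foreach (TKey key in oldest)
        {
            entries.Remove(key);
        }
    }
}
```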
My prototype is only capable of giving a callback; what the callback does to relieve the memory pressure is flexible, and you can do whatever makes sense for your application. Your idea to eliminate the entries that are least recently used seems very reasonable to me.
Hey @cshung, this is awesome! Thank you for putting together an initial prototype 😄 I'll see if I have time to actually test this in one of my scenarios soon; I had a few pieces of feedback in the meantime.
I think we absolutely need support for multiple callbacks for this to be a viable API to actually support, yes. Storing the callback(s) in managed code makes sense to me, especially considering the idea was to stop using the finalizer thread to run the callbacks, and use something like Regarding the "who keeps the callback alive?" point and the originally proposed signature, do we all agree on how should the various registered components be conceptually linked together? My assumption was to have something like this:
This should make it so that:
Thoughts on this? Curious to hear if there's any other approaches any of you had in mind for this 🙂 |
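One way to express that lifetime linkage ("the callback is kept alive by the target, and dies with it") is with the real `ConditionalWeakTable<TKey, TValue>` BCL type; the surrounding registry sketched here is hypothetical:

```csharp
using System;
using System.Runtime.CompilerServices;

internal static class MemoryPressureRegistry
{
    // The table keeps each callback alive exactly as long as its target:
    // when the target is collected, the entry disappears automatically,
    // so no explicit unregistration is needed.
    private static readonly ConditionalWeakTable<object, Func<object?, bool>> callbacks = new();

    public static void Register(Func<object?, bool> callback, object target)
    {
        callbacks.Add(target, callback);
    }
}
```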
I was just going to suggest that. I recently solved something similar in Paint.NET: a static event registrar, where I store a |
This callback would not be sufficient for ArrayPool. ArrayPool would keep using |
Would you mind elaborating on what is missing? |
ArrayPool trimming is triggered by more than just low memory pressure. |
I am thinking about this flow of events:
Right now I have only done 1 and 2; 3 and 4 are outside of my regular development work and would take significant time for me to complete. I was hoping that either the community could help me with that, or we could experiment with the prototype and get the feedback we need without it. This will probably lead to a much faster iteration time than if I completed them all myself, though it doesn't solve the whole problem. Just to re-iterate, this is meant to be a prototype. Of course, I agree we will need multiple callbacks when we eventually release the feature, if we decide to.
Got it, it probably doesn't make sense to trim only when we run out of memory. This API wasn't intended to be the only signal for trimming ArrayPool. It is intended to be a |
@cshung Love the enthusiasm and I'm really glad to see some traction on this proposal! I'd be happy to help out with 3) and 4) given those points would primarily only touch the managed runtime 😊 On this note: if we need another medium to have a quicker back and forth as we work on this, you'd be welcome in the |
Which components that the initial comment talks about would be able to replace |
I am not sure I understand. If I replace line 292 here (line 292 in f53c8dc) with the …, then I miss some events that would have called …
Correct.
Similarly, |
I have a couple questions on this:
|
Here is how ArrayPool trimming works today: every once in a while, the array pool goes through all pooled items and trims the ones that have not been used for some time. The aggressiveness of the trimming is controlled by memory pressure. The Gen2 GCs are used as a convenient approximation for "every once in a while". It is based on a simplifying assumption that ArrayPool usage is likely correlated with regular allocations in a typical app. If this API is implemented as a callback triggered by a small subset of Gen2 GCs, it is not going to be better than
Good question. Here are my thoughts: cache management is a policy-heavy problem. There is no one universal good policy. For example, consider an app that handles bursty traffic. Good P99 latency is the most important metric of this app. The app author deploys it on a dedicated VM with plenty of memory. Cache trimming is counterproductive for an app like that: it is just going to make P99 latency worse, without improving any metric that matters. On the other hand, consider a desktop app that is meant to be in the background most of the time, and only occasionally in the foreground. It is desirable for the caches to be trimmed while the app is in the background, so that it does not consume memory unnecessarily. Nobody likes to see apps running in the background consuming GBs of memory. We have been reactively introducing settings that allow developers to tweak policies of various caches in the system. The RetainVM setting or the RegEx.CacheSize API are examples of such settings. However, most caches (including ArrayPool) have hardcoded non-configurable policies. It is not unusual to see app models try to trim memory based on interesting events. For example, Windows Phone or UWP apps triggered a GC before getting suspended, to minimize the working set of the suspended process. High-density freemium web hosting plays similar tricks sometimes. I think the set of APIs for cache management should:
Also, I think that it is important to be able to arbitrage effectively between different unrelated caches somehow. There are caches with items that do not occupy too much memory and take a lot of time to recreate (e.g. RegEx cache); on the other end, there are caches that occupy a lot of memory and take little time to recreate (e.g. ArrayPool). The system should ensure that the caches that are cheap to recreate are trimmed first and more aggressively. I have mentioned this issue above, but it did not seem to resonate. We can ignore this point for now. |
I do feel like we're moving a bit away from what the original proposal was, which is to offer a built-in replacement for `Gen2GcCallback`. Consider something like this:

```csharp
enum GcCallbackPolicy
{
    // Basically "whenever Gen2GcCallback gets called today"
    Periodic,

    // Only when there's actually memory pressure
    MemoryPressureOnly
}

static void RegisterGcCallback(Func<object?, bool> callback, object? target, GcCallbackPolicy policy);
```

Now, the idea for this API shape is this:

Again, to be clear, I'm not suggesting this API shape is ideal/final, I just want to keep iterating on the API shape to see whether we can figure out what one that satisfies all the requirements that have been brought up would look like 😊
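Under this strawman shape (all names hypothetical, taken from the snippet above), the two kinds of consumers discussed in this thread would pick different policies:

```csharp
// ArrayPool-style consumer: wants the periodic "a gen 2 GC happened" cadence,
// and reads memory pressure itself to decide how aggressively to trim.
GC.RegisterGcCallback(static target =>
{
    ((MyPool)target!).TrimBasedOnCurrentPressure();
    return true;
}, myPool, GcCallbackPolicy.Periodic);

// Messenger-style consumer: only cares about trimming when memory is
// actually constrained, so it opts into the pressure-only policy.
GC.RegisterGcCallback(static target =>
{
    ((WeakReferenceMessenger)target!).Cleanup();
    return true;
}, messenger, GcCallbackPolicy.MemoryPressureOnly);
```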
That's not always a drawback. It is appropriate to run the cleanup on the finalizer thread in some cases. For example, I do not think we would want to add more overhead to the ArrayPool trimming by scheduling it somewhere else. I expect a number of other
If that is the case, why is WeakReferenceMessenger not implemented this way today? If I am reading the code correctly, the WeakReferenceMessenger trimming is always done, without accounting for any memory pressure. Or why does WeakReferenceMessenger need the automatic trimming on memory pressure at all? Is it worth the overhead?
I think this is perfect.
I feel like it's worth at least allowing consumers to configure the trimming policy of
I was writing an entire paragraph on why the suggestion is good, but |
Re @jkotas:
The rationale for that was (after discussing with @GrabYourPitchforks) that it could've avoided a number of issues that can arise from code running in the finalizer thread, if people are not very careful about what they're doing there. For instance, taking a lock on the finalizer thread is a very bad idea, but not everyone knows that. I guess one could also make the argument that people using such an API would be aware of these details as well though, sure. Would you imagine this API to just always invoke callbacks inline on the finalizer thread, or maybe that it would expose an option to choose where to be dispatched?
That is correct, today there are no checks for memory pressure. One of the reasons was that since we can't take a lock on the finalizer thread, we're just trying to acquire one to then do the cleanup. I wanted to avoid cases where the callback is only invoked when there's really a lot of pressure, then it fails to acquire the lock because someone else holds it at that moment, and then the trimming is just skipped entirely. This is certainly something that could be improved though. As to why it needs automatic trimming in the first place, the rationale here is that the type should be as easy to use as possible, with consumers not having to care about the implementation details or about manually trimming it. The reason why it needs trimming is that registered recipients can be collected at any time (as they're weak references), which can result in leftover items in the internal mapping (ie. key/value pairs for message types that actually hold no recipients at all). Trimming just ensures the internal data structures are kept as lean as possible, which both saves a tiny bit of memory and can speed up broadcast operations as they won't have to waste time going through empty collections looking for actual recipients to send messages to. Re. @AraHaan:
I feel like this is just a misunderstanding due to me using a bad name for that |
Taking a long lock on a global singleton data structure is a very bad idea. It does not matter which thread you use to take the lock. It is fine to take a short lock on the finalizer thread. It happens a lot. In fact, the ArrayPool trimming takes a short lock on the finalizer thread too.
Right. |
To be clear, the lock is on an internal data structure, not on the publicly exposed singleton instance. Conceptually it should be the same as eg. … Also, to further clarify this, using the singleton instance is not really the recommended approach anyway. That property mostly exists to make the transition easier for MvvmLight users, who were used to the singleton instance for the previous messenger type there. The recommended approach for the MVVM Toolkit is ideally to just inject the messenger with DI into viewmodels. Not including the singleton would've made the migration too difficult for existing users though, so we added it for convenience.
I guess I might've misunderstood how bad this was when I talked about it with @GrabYourPitchforks a while back. |
Background and Motivation

There are a number of scenarios where library authors would like to perform some kind of cleanup/trimming of data structures or other objects, with the assumption that this operation should ideally run when there's enough memory pressure and as long as some target object is alive (typically the object that's being trimmed). An example of this is `TlsOverPerCoreLockedStacksArrayPool<T>` in the runtime, which leverages the internal `Gen2GcCallback` type to automatically execute trimming whenever a gen 2 GC collection occurs. Other library authors (myself included) have had the need to achieve something like this as well, for instance `WeakReferenceMessenger` in `Microsoft.Toolkit.Mvvm` (link) and `Ben.StringIntern` (link).

The current approach relies on just copying the `Gen2GcCallback` type, but that has a number of issues.

This proposal is for a proper, built-in API to register a callback to be executed when there's memory pressure. This means that the runtime/GC would also be free to decouple the execution from just strictly when a gen 2 collection occurs, leaving more space to possibly make decisions on how to execute the callback as well. It would also be easier to use for consumers, and more reliable (for instance, running arbitrary code from a finalizer is not really the best idea in the first place).

Proposed API

The API shape I had in mind was to essentially just expose the same features as `Gen2GcCallback`, but with a single API:

```csharp
namespace System
{
    public static class GC
    {
        public static void RegisterMemoryPressureCallback(Func<object?, bool> callback, object? target);
    }
}
```

The state can either be `null`, or some instance. If it's an instance, then the GC will keep that reference as a weak reference (passing this instance to this API will not keep that state alive after the call to `RegisterMemoryPressureCallback`), and also automatically unregister the callback when the object is collected. Meaning that if you do pass a target instance, then the callback can assume that the input value will never be `null`. This would allow consumers to achieve the same as the two overloads for `Gen2GcCallback`: either just pass a `null` target and ignore the input state in the callback, which would just act as a static and stateless callback, or pass some target instance and then get it as input in the callback.

This API would be easy to use and it would support all the existing scenarios as with `Gen2GcCallback`.

Usage Examples

See any of the existing `Gen2GcCallback` usages linked above.

Alternative Designs

Here's an alternative proposal with an interface-based callback from @GrabYourPitchforks:

```csharp
namespace System
{
    public interface IMemoryPressureCallback
    {
        bool Invoke();
    }

    public static class GC
    {
        public static void RegisterMemoryPressureCallback(IMemoryPressureCallback callback, bool strongReference);
    }
}
```

Quoting from Levi:

My main issue with this design is that, quoting myself from our previous discussion:

Open questions

- Should the `Unsafe` prefix be used for the API to indicate no execution state capturing? (Jan suggested no)
- Should the callback get a `GCMemoryInfo` param given that most consumers would likely need it?
- Is `Func<T1, T2, TResult>` fine or would we want to declare a custom delegate (maybe nested within the GC type)?

Risks

None that I can see. The API would also be in the `GC` type, which is only ever used by more advanced users. Either of the proposed APIs would also be much more reliable and generally better than using `Gen2GcCallback` like is the case today. Additionally, not having arbitrary code being run from within the finalizer queue reduces the chance of possible other issues (eg. someone locking from there or something).