Look into whether parameter creation from a tensor leads to incorrect dispose scope statistics. #1397
Comments
@mvphelps -- I can't assign this to you, but could you take a look at the issue?
Maybe there should be a `ReplaceWith` method that moves the dispose scope registration from the old tensor to the new one?
@NiklasGustafsson Sure, happy to look at it. To clarify, the code you show above is not present in the main branch. So is this a proposed implementation for Parameter? I do think this implementation makes perfect sense - much like PackedSequence, it should take ownership of the data tensor and dispose it properly, all while managing the statistics correctly. As for the second ctor, which accepts just the IntPtr handle for a Tensor, I'm thinking we just create the Tensor from that handle and then delegate to the other ctor. I'd also add another test fixture to deal specifically with stats for parameters.

Edit: To be clear, I do not mean adding a complete new set of statistics for Parameters - these are still Tensors. Is that what you are thinking? I just want to make sure I understand fully before I go making the PR. Thank you.

Also, could you explain the MoveHandle method on Tensor, since it is being used here? I see the code, and it is simple, but it has no comments and I don't understand why it's necessary. My guess is that it does something unobvious to avoid having multiple references to the same handle: it takes the handle from the original tensor and makes it the handle for the Parameter instead, ensuring there is only one reference to the IntPtr so the GC can do its thing?
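A minimal sketch of the idea behind such a handle move, assuming a Tensor that wraps a native handle in an IntPtr field (an illustration of the concept, not the project's actual source):

```csharp
// Transfer ownership of the native handle to the caller. Afterwards this
// Tensor no longer owns the handle, so disposing or finalizing it will
// not free the native tensor out from under the new owner.
internal IntPtr MoveHandle()
{
    var h = handle;        // the native handle this instance wraps
    handle = IntPtr.Zero;  // forget it; Dispose() becomes a no-op for it
    return h;              // the caller (e.g. a Parameter ctor) now owns it
}
```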
I like the ReplaceWith idea - it would just move the registration from the scope on the old object to the new object, and null out the scope on the old. The nice thing is that it wouldn't actually increment anything. Seems like a pretty clean way to do it.
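A sketch of the contrast between the two approaches, assuming the statistics count attach/detach events (the ReplaceWith call is hypothetical at this point in the thread):

```csharp
// Option 1: detach, then re-attach - the statistics record both events.
tensor.DetachFromDisposeScope();  // bumps the detach count
scope.Attach(parameter);          // bumps the attach count

// Option 2: swap the registration in place - nothing is incremented;
// the scope simply tracks the replacement instead of the original.
DisposeScope.ReplaceWith(tensor, parameter);  // hypothetical name/location
```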
This code is already in the main branch. Where it is failing is in a branch I'm working on, where many of the Modules are reimplemented in C# and use functional APIs to get to the native code. That means that the code needs to create Parameter instances, which are tensors used as module parameters. Parameter is a sub-class of Tensor, but you can't just downcast, so the construction of a Parameter takes a Tensor and then takes over its native tensor handle. This means that there needs to be special treatment of the handle for this case. Anyway, it all works, including the Attach/Detach pair, except that it seems to mess with the new, more precise statistics gathering that you introduced.

Here is, for example, the new constructor for Linear:

```csharp
internal Linear(long inputSize, long outputSize, bool hasBias = true, Device? device = null, ScalarType? dtype = null) : base(nameof(Linear))
{
    this.in_features = inputSize;
    this.out_features = outputSize;

    weight = torch.empty(outputSize, inputSize, device: device, dtype: dtype).AsParameter();
    init.kaiming_uniform_(weight, a: _sqrt5);

    if (hasBias) {
        bias = torch.empty(outputSize, device: device, dtype: dtype).AsParameter();
        var (fanIn, _) = init.CalculateFanInAndFanOut(weight);
        var bound = fanIn > 0 ? 1 / Math.Sqrt(fanIn) : 0;
        init.uniform_(_bias, -bound, bound);
    }
    //NOTE: it's important not to call 'RegisterComponents' here.
}
```

and the `bias` and `weight` property setters:

```csharp
public Parameter? bias {
    get => _bias;
    set {
        _bias?.Dispose();
        _bias = value?.DetachFromDisposeScope() as Parameter;
        ConditionallyRegisterParameter(BiasComponentName, _bias);
    }
}

public Parameter weight {
    get => _weight!;
    set {
        if (value is null) throw new ArgumentNullException(nameof(weight));
        if (value.Handle != _weight?.Handle) {
            _weight?.Dispose();
            _weight = (value.DetachFromDisposeScope() as Parameter)!;
            ConditionallyRegisterParameter(WeightComponentName, _weight);
        }
    }
}
```

Note that parameters inside modules must not be in a dispose scope.
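To illustrate that last point, a hedged sketch of why the setters call DetachFromDisposeScope (assumed usage, not code from the branch):

```csharp
using (var scope = torch.NewDisposeScope())
{
    // The module's parameters are created inside the ambient scope...
    var lin = torch.nn.Linear(10, 5);

    // ...but they must outlive it. If weight and bias stayed registered
    // with the scope, disposing it would free their native handles and
    // leave the module holding invalid tensors.
}
// Because the setters detach the parameters, the module remains usable here.
```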
I must be missing something. I do not see this on main at all, and I'm looking directly at the source here on GitHub. What am I missing?
You're right (again :-)). It's in my soon-to-be-created PR. Forgot that it wasn't on main. Come back in an hour... Sorry!
No worries, it happens. I'm in the Central time zone and it's near EOD for me; I'll check it out in the morning. Should be able to have a PR for you tomorrow.
I'll try the 'ReplaceWith' approach and see if that works. Maybe that's all it takes.
Ok great. I think the only thing missing from what you have is that the OwningDisposeScope is not being cleared on the tensor. So even if the Parameter is detached, the Tensor is still registered and will get wiped when the scope disposes. Once you do that, I think it should work.
This is what it looks like (and it seems to work):

```csharp
/// <summary>
/// Replaces registration of one tensor with another.
/// </summary>
/// <param name="original">The original tensor, possibly registered under a dispose scope.</param>
/// <param name="replacement">The replacement tensor.</param>
internal static void ReplaceWith(torch.Tensor original, torch.Tensor replacement)
{
    DisposeScope? scope = original.OwningDisposeScope;
    if (scope != null && scope.Disposables.Remove(original)) {
        original.OwningDisposeScope = null;
        AddToOther(scope, replacement);
    }
}
```
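Presumably the Parameter constructor then calls this instead of the detach/attach pair - a hypothetical call site (the actual integration may differ):

```csharp
// Hypothetical: inside Parameter's ctor, after taking over data's native
// handle, move data's dispose-scope registration to the new Parameter.
ReplaceWith(data, this);
```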
See:
@NiklasGustafsson Looks good. I suggest adding the following test to TestDisposeScopesStatisticsTensor.cs. We don't need anything more comprehensive, as this captures the unique behavior. Figured it would be easiest for you if I just pasted it here:
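The pasted snippet did not survive extraction. A hypothetical reconstruction of what such a test might look like, assuming xUnit and the internal OwningDisposeScope property seen elsewhere in this thread (the names and assertions are guesses, not the original):

```csharp
[Fact]
public void CreatingParameterFromTensorMovesScopeRegistration()
{
    using var scope = torch.NewDisposeScope();

    var tensor = torch.zeros(2, 2);        // registered with the scope on creation
    var parameter = tensor.AsParameter();  // takes over the native handle

    // The Parameter now owns the scope registration...
    Assert.Same(scope, parameter.OwningDisposeScope);
    // ...and the original Tensor is no longer tracked, so disposing the
    // scope will not touch the moved handle through the stale wrapper.
    Assert.Null(tensor.OwningDisposeScope);
}
```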
Done. |
I think there's a problem handling the statistics for this situation (in Parameter.cs):
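The code block from the original report was lost in extraction. Based on the discussion above (a ctor that takes a Tensor, moves its native handle, and re-attaches to the tensor's dispose scope, plus a second ctor taking just an IntPtr), it was presumably along these lines - a reconstruction, not a verbatim quote:

```csharp
public class Parameter : Tensor
{
    // Takes over the native handle of 'data' (via MoveHandle), then
    // registers the new Parameter with the scope the tensor was in.
    public Parameter(Tensor data, bool requires_grad = true)
        : base(data.with_requires_grad(requires_grad).MoveHandle())
    {
        var scope = data.OwningDisposeScope;
        if (scope is not null) {
            data.DetachFromDisposeScope();
            scope.Attach(this);
        }
    }

    // Wraps an existing native handle directly.
    internal Parameter(System.IntPtr handle) : base(handle)
    {
    }
}
```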
I don't have a small repro, but I don't think the stats are collected properly when a parameter takes over the native handle from a non-Parameter tensor and registers itself with its dispose scope.