Add TensorScope #188
Conversation
Re eager operations: I could include them in

Also, the CI seems broken.

How would you go about supporting resources other than "tensors"?

I'm intentionally not handling anything other than tensors in

I understand that you're intentionally not doing anything for resources other than "tensors", but there are other resources than tensors. What would you do for those?

Which other resources? They're being managed just fine now; I wouldn't change how they are handled, except potentially the eager handles.

Tensors are the most annoying one, for sure, but there are many more resources that are managed natively by TF, including graphs, sessions, builders... and soon function graphs. It is probably ok to keep the actual

Yeah, they are all "big" enough and singular enough that I don't think manually handling them is a huge issue. Plus, as far as I know, the memory use of each is much smaller than tensors (or comes mostly from its tensors), so being on top of closing them isn't such a big issue.
```java
if (!localScopeStack.contains(this)) {
  throw new IllegalStateException("This scope is not on the scope stack, but was not closed."
      + " This should not be possible.");
}
```
Why are you allowing scopes only on the stack??
Since it closes child scopes by closing any scopes in front of it in the queue, if it somehow got removed from the queue, that would silently close all scopes. This shouldn't happen, but it did before I started storing the stack as a local variable (when a scope was closed in a different thread than it was opened in), so this is just for safety.
No, I mean: if `localScopeStack` doesn't contain `this`, then this test fails. That makes it impossible to use a given instance as a field and/or from some other thread, doesn't it?
Kind of. `localScopeStack` is a local variable, set from the thread local when the scope is created. It's not thread local itself, so `this` should always be in it.

It does mean that, semantically, the scope is closed on the same thread it was opened on. This makes some sense, because it's only going to collect tensors from that thread anyway.
I'm pretty confused by what this is trying to do. Is there a clear example of the behaviours it's supposed to enable?
The interaction between the thread-local stack and the closing seems like it will need careful documentation to ensure correct usage, as it looks like there are plenty of ways to get it wrong.
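To make the stack mechanics under discussion concrete, here is a minimal sketch of a per-thread scope stack. This is purely illustrative: `DemoScope`, its fields, and its methods are my stand-ins, not the PR's actual `TensorScope` implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a thread-local scope stack. Opening a scope pushes it;
// closing an outer scope also pops (closes) any inner scopes still
// in front of it, matching the behavior described in the comments.
final class DemoScope implements AutoCloseable {
    private static final ThreadLocal<Deque<DemoScope>> STACK =
        ThreadLocal.withInitial(ArrayDeque::new);

    // Captured at construction: a plain field, not itself thread-local,
    // so close() can validate that the scope is still on its stack.
    private final Deque<DemoScope> localScopeStack;

    DemoScope() {
        localScopeStack = STACK.get();
        localScopeStack.push(this);
    }

    static DemoScope current() {
        return STACK.get().peek(); // null when no scope is open
    }

    @Override
    public void close() {
        if (!localScopeStack.contains(this)) {
            throw new IllegalStateException(
                "This scope is not on the scope stack, but was not closed.");
        }
        // Pop (close) inner scopes until this scope is on top, then pop it.
        while (localScopeStack.peek() != this) {
            localScopeStack.pop();
        }
        localScopeStack.pop();
    }
}
```

Since `localScopeStack` is captured when the scope is created, closing it from another thread finds a stack that doesn't contain the scope, which is exactly the failure mode the `contains` check guards against.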
* they are created in a sub-scope). Tensors may be manually closed earlier without issue.
* <p>
* When auto-attach is true, tensors are automatically tracked on creation. A tensor can be manually added to a scope
* with {@link TensorScope#attach(Tensor)} or {@link Tensor#attachToCurrentScope()}. The tensor will then be closed when the first of its managing scopes closes.
Why would we want Tensors to be closed by the first scope closure rather than the last? If it's attached to multiple scopes then I'd expect it to still be alive if any of the scopes were alive.
Ah, is it because of the stacking of scopes? Even though they could be stored in fields and thus mess up the logical stacking?
Right, this situation shouldn't happen anyway. It's not clear to me that this was designed to be used in fields, so let's assume we only have references on the stack. If we assume that's true, then any "inner" scopes are supposed to be closed before "outer" scopes. If this doesn't happen, there's probably some logic gone pretty wrong somewhere that this should be the least of our worries...
Yeah, exactly, it's not intended to be used in fields, unless it's as a way to ensure that any created tensors are closed when the creator is closed, like in EagerSession. I'm thinking about removing that anyways.
* {@link Tensor#detach()} detaches the tensor from all scopes, requiring the user to close it manually or attach it to
* another scope.
* <p>
* Note that scope management is thread local, except for detach, which will detach even from scopes on other threads.
Why do we want the thread local scoping? How is it going to interact with Loom?
Well, Loom is supposed to give us an alternative to TLS:

http://cr.openjdk.java.net/~rpressler/loom/loom/sol1_part2.html#scope-variables

But that's not going to happen anytime soon, so in the meantime we need something... It sounds to me, though, that if we were to replace TLS with something like that, we would need to juggle 3 types of objects: `Tensor`, `TensorScope`, and something like `TensorScopeScope`... Am I reading this correctly?
`Scoped` would be perfect for this, I think. Probably rather than a `TensorScopeScope`, I'd have it add itself to the current scope as a child on creation, or something like that.
Well I assume it's thread local to avoid a bunch of synchronisation, but it's not clear why we'd want the scopes to be thread local. What's the argument for that behaviour?
Because it's a lexical scope (like `{}`), representing a block of code or a certain segment of execution. It doesn't make any sense to do something like

```java
try (TensorScope scope = new TensorScope()) {
  myTensorFunction();
}
```

and have it pick up tensors created by something else going on in another thread.

My latest changes should alleviate the issue somewhat: I'm using `InheritableThreadLocal` for the current scope, but the hierarchy can cross threads, so if you create new threads in a scope like the above, they will add tensors to the scope, but previously existing threads won't.
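The `InheritableThreadLocal` behavior being relied on here can be demonstrated in isolation. This is a plain Java sketch, not TF-Java code; `CURRENT_SCOPE` and the string values are illustrative.

```java
import java.util.concurrent.atomic.AtomicReference;

// Demonstrates that a thread created while a value is set inherits it,
// while a thread created before (or with no value set) sees null.
class InheritDemo {
    static final InheritableThreadLocal<String> CURRENT_SCOPE =
        new InheritableThreadLocal<>();

    // Spawns a fresh thread and reports what it observes as the current scope.
    static String observedByNewThread() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread t = new Thread(() -> seen.set(CURRENT_SCOPE.get()));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        // No scope open yet: a child thread inherits nothing.
        System.out.println(observedByNewThread()); // prints: null
        CURRENT_SCOPE.set("scope1");
        // A thread created while "scope1" is current inherits it.
        System.out.println(observedByNewThread()); // prints: scope1
    }
}
```

Note that inheritance happens at thread creation time, which is why threads that already exist when the scope opens never see it, matching the behavior described above.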
```java
localScopeStack = scopeStack.get();
localScopeStack.push(this);
```
So if this gets stored in a field then any other scopes allocated by that thread get stuck inside it? What happens if someone uses a FJP or other executor service and has a scope in there?
See recent changes and comment above.
I guess I'd thought of a scope as something I could use to pass around a logical grouping of tensors and ensure they are all closed at the same time, for example the output of session.run. This stacking behaviour seems to contradict that: if I store the scope somewhere to keep that collection of tensors for later, then all the other scopes created afterwards get stuck to it, and if I close one of the scopes then any created afterwards would die. That seems extremely counter-intuitive to me. Did I misunderstand how this works?
The semantics I'm going for here are more of a bound/guarantee that resources won't leak beyond the scope unless explicitly flagged (via `detach()`).

The biggest reason for this, other than the "flag to keep" semantics, is that it leaves resource management entirely up to the end user at the highest level. The user doesn't have to know or care about any tensor creation done by the functions they call. Instead of having to use special collections, implement some scope class, or use TensorScope at each level, the user just uses TensorScope themselves. You could use ordinary collections, custom data classes, or whatever. For example, with `Result`:

```java
Session session = new Session(graph);
TFloat32 answer;
try (TensorScope scope = new TensorScope()) {
  Result result = session.runner().run();
  answer = result.get(1);
  answer.detach();
}
```

Regardless of what outputs are returned by the runner, all of them except `answer` are closed.

The thread interactions definitely need some notes in the docs. I went with thread local since I'm trying to wrap regions of code, and there could be multiple running in parallel on different threads, whose scopes shouldn't interact. I don't know how that would interact with Loom, but I don't think it would cause issues? From what I've seen, they are trying to make code using the current thread API port seamlessly. It definitely doesn't work well with Kotlin's coroutines, since they can suspend and resume on different threads, but I can add stuff for that in the Kotlin API.
How is the
The only time you get a tensor attached to more than one scope is if you manually add it to a new one, which is mostly when it was detached in the past, or you want to manually give it a shorter lifetime. Remember this is a lexical scope, not a persistent resource owner. The only times you have multiple scopes in play, they will be nested, and an inner scope naturally expires when its containing scope does. There shouldn't be "overlapping" lifespans.
Ok, let me try to ask you this a different way: if we're not supposed to have the same tensor attached to multiple scopes, why try to support that in the first place?
I can't think of a use case; I'm pretty sure it's allowed because I didn't see a reason to explicitly forbid it. I'll do that now, though — it does make things clearer.
The reason for guaranteeing that we have only a single known reference to an object is that deallocating it won't have any unintended consequences.
Ok, I pushed a refactor with a few things:
Ok, so I think I understand what you're aiming for a little better. However, I don't understand why introducing the notion of scoping, with its implicit magic that checks the valid scopes, is better than having an

What's the positive case for
You mean like

I've gotten rid of most of the thread-local tricks, too; it's a good bit easier to reason about.

If you mean something more factory-like, the biggest reason I can see off the top of my head is to keep resource control in the hands of the end user. If we use a factory, we have to pass it through everywhere, and if you don't somewhere, there's no way to capture or manipulate those tensors after the fact (assuming you can't edit the API). In theory it's fine, but in practice it's easy to get burnt by someone else's poorly designed APIs, especially given that a lot of deep learning code is written as part of research and isn't necessarily the best quality by software engineering metrics.

The implicitness is there anyway from

cc @karllessard — we talked about this during the call, and you have a better idea of the overall API direction than I do.
Given that each Tensor should belong to only one scope, what's the point of keeping track of "parents" and "children"? What's the use case?
I'd prefer to have as little implicit behaviour as possible. With this PR, any method that returns a

I'm not sure if implementing

The fact that lots of deep learning code is written by researchers is probably less relevant to TF-Java, as it's not as likely to be used in academia, but I agree that it's important to have well-designed APIs that are easy to use.

My point about factories was that a
It's just an implementation detail; it replaces the queue. It's necessary for the "parents close their children" behavior and for reverting the current scope when a scope is closed.
I agree with you broadly about explicitness; I just think it makes more sense to require the rare and dangerous case (i.e. a detached tensor) to be made explicit, rather than the default/safe case. The callee can opt out of the automatic management using detach; this just requires them to be explicit about it. And while the scoping is a bit implicit, lexical scoping is well defined and common enough that I don't think it's a huge burden. It's one of the reasons I went with a try-with-resources based API over storing scopes as fields, because the latter is very implicit and hard to reason about. And note that the only implicit state changes are almost entirely hidden from the user, and can be checked with

I'm not too worried about the extra boilerplate of manual lifetime management, although it is annoying. What I don't like is that it requires the user to be aware of every created tensor (to add them to the collection), or forces them to assume that 3rd party code is correct. It's a lot of mental overhead when reasoning about the code, and it's easy to have one slip through the cracks, so to speak (there's a reason memory bugs are so common in non-GC languages). The automatic attachment is more of a safety net than anything, to prevent the end user from needing to keep track of every tensor. In high memory use cases, they are going to want to manage tensors manually anyway, rather than waiting for the end of the scope.
Yeah, I'm just drawing from the examples I saw in the type refactor code at one point.
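The single-owner / explicit-detach semantics being debated here can be mocked in a few lines. This is my own sketch to make the contract concrete — `MockScope` and `MockTensor` are illustrative stand-ins, not the PR's `TensorScope` or `Tensor` API.

```java
import java.util.ArrayList;
import java.util.List;

// A resource belongs to at most one scope; attaching to a new scope
// moves it rather than sharing it. detach() makes closing the caller's
// explicit responsibility, which is the "rare and dangerous" case.
final class MockScope implements AutoCloseable {
    final List<MockTensor> owned = new ArrayList<>();

    MockTensor attach(MockTensor t) {
        if (t.owner != null) {
            t.owner.owned.remove(t); // single owner: move, don't share
        }
        t.owner = this;
        owned.add(t);
        return t;
    }

    @Override
    public void close() {
        for (MockTensor t : owned) {
            t.doClose();
        }
        owned.clear();
    }
}

final class MockTensor {
    MockScope owner;
    boolean closed;

    // Explicit opt-out of automatic management.
    void detach() {
        if (owner != null) {
            owner.owned.remove(this);
            owner = null;
        }
    }

    void doClose() {
        closed = true; // stand-in for freeing the native handle
    }
}
```

Under these rules, everything attached to a scope dies with it, while a detached resource survives until someone closes it by hand, which matches the "flag to keep" behavior described earlier in the thread.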
I think the appeal to lexical scoping is missing that the scope tracks through method calls. If I open a scope, then call into a library to convert my input data into tensors, which then calls into another library, then those tensors get attached to the scope. The library developer doesn't know how to manage the tensors they create, as their methods could be called within a scope or outside a scope. Library code shouldn't have its state changed implicitly by things in the calling environment; it breaks encapsulation.

The automatic attachment is only a safety net if people use it. We can't enforce it, and users can't see how it works in their code because it's implicit, meaning it might be refactored around because it uses mechanisms not present elsewhere in Java. Java doesn't have scope variables that implicitly alter the state of enclosed code. As I said, most IDEs and linters will flag
@rnett:

This is a great summary of the issue we want to tackle here. For sure, the usage of a thread-local stack is very tempting; I also had in mind other cases that would benefit from it, for example to carry the last instance of

On the other side, I agree with @Craigacp that being too implicit in some cases might eventually lead to more issues and confusion for developers, for all the reasons he pointed out.

It seems though that we all agree that we need some sort of tensor autocloseable container, like proposed in #167. I continue to think, though, that the solution presented in that PR is too session-centric and should be redesigned to consider all possible use cases for grouping tensors together, i.e.:

Doing this might not remove the need of having some sort of
I mean, I was asking for a use case — what is the use case you have in mind? The only thing I can see is something like this:

```java
try (TensorScope scope1 = new TensorScope()) {
  try (TensorScope scope2 = new TensorScope()) {
    Result result = session.runner().run();
    answer = result.get(1);
    answer.releaseToParent();
  }
}
```

But we don't need to have TensorScope babysit this for us by tracking everything. The following works just fine by doing it manually, and it's the same number of lines:

```java
try (TensorScope scope1 = new TensorScope()) {
  try (TensorScope scope2 = new TensorScope()) {
    Result result = session.runner().run();
    answer = result.get(1);
    answer.attach(scope1);
  }
}
```

What do you mean by "reverting"? In any case, parents don't need to close their children, because they get closed before anyway. So, I'm asking you again: what's the use case? I don't see it!

I'm coming at this from another angle, but allowing only a single scope to own any given tensor makes the use of thread-local storage a bit moot. The code above would work exactly the same, correct? We don't get an implicit attach to scope2, but nothing gets implicitly attached anyway when no scope is found on the stack. To solve both cases, users need to be aware of TensorScope anyway!

BTW, the reason I'm using thread-local storage for PointerScope is because JavaCPP needs to map C++ APIs that were designed with a stack in mind, but here we are in control of the API, so we can design something that makes more sense for the Java language.
Agreed. I need to change the API a little to do this, but I want

The way I envision it, there are three possible resource management states for tensors you're working with when writing a method: managed by me with a scope, explicitly detached from all scopes, or the caller's responsibility. They roughly map to

```java
public TFloat32 runModel(TFloat32 inputA) {
  try (TensorScope scope = new TensorScope()) {
    TFloat32 inputB = getInputB();
    TFloat32 result = session.runner().feed("a", inputA).feed("b", inputB).fetch("result").run().get(0);
    result.release();
    return result;
  }
}
```

This works the same regardless of whether it's called in a scope or not. The "default" state of the tensors, so to speak, is
I haven't actually seen any AutoCloseable warnings in IntelliJ, which, now that I think about it, is weird, because I would expect to. We could have Tensor and HasTensors not implement AutoCloseable, but given that Eclipse at least seems like it would only flag these issues as potential resource leaks (see here), I don't think it's a huge issue, although it's definitely not ideal. The only refactoring I can think of that would cause issues is moving statements outside of the try block, which I would think would be avoided anyway, because that would change the semantics of the code. And yeah, we can't make people use it, but we shouldn't have to. They are free to manually manage lifetimes if they wish.
That would be interesting, although we'd have to be careful about mixing execution environments (like with tf.function-style graphs in eager code). One thing I don't like about that use case is that methods that depend on an Ops instance will fail without it, whereas with TensorScope they will still work fine regardless of whether they're called from within a scope or not (which is the issue Adam brought up).
See above, which mitigates some (most?) of it. There's still a bit of implicitness, but it should be transparent to callers and callees.
My issue with the idea of using a "container", if I understand what you're thinking of correctly, is that you still have to know about all the tensors being used; some of them are just grouped. Sure, it makes it easier, but you still have to juggle them mentally — there are just fewer of them.

Re:

The issue with your second example is that it assumes the parent scope is always in (the language) scope. If you wrapped your inner scope in a method, what you did wouldn't work, which is why I'm saving the parent. What I mean by resetting the current scope is that since I'm using a single variable instead of a scope stack, when an inner scope closes, the outer scope needs to go back to being the current scope. It's like:

```java
Scope old = current;
block();
current = old;
```

Plus, with the issues Adam brought up regarding encapsulation and transparency, the API shouldn't change depending on whether you're in a scope or not, so the "do I have a parent" question needs to be answered internally and hidden from the user.

For thread local, imagine something like:

```java
ExecutorService e = ...;
try (TensorScope scope1 = new TensorScope()) {
  e.submit(() -> {
    someTensorStuff1();
  });
  e.submit(() -> {
    try (TensorScope scope2 = new TensorScope()) {
      someTensorStuff2();
    }
  });
}
```

The launched tasks should execute their code "in" scope1.
That's not actually a feature; that's a bug. We can work around limitations of the Java language using thread-local storage, but altering the state like that is a bug. That's kind of what old-style APIs like OpenGL used to do with a state machine, which they got rid of with Vulkan. There are at least 2 sane ways to go about it:

Ok, but what does that get us, concretely? Why does it matter? If you're worried about having to call attach() manually 1000 times, then let's see what we can do about that. Maybe pass the scope to sessions and let them do the attaching for us? Maybe have the session return a scope/group, in which case maybe we can name it TensorGroup?

I think this works fine without thread-local storage:

```java
ExecutorService e = ...;
try (TensorScope scope1 = new TensorScope()) {
  e.submit(() -> {
    someTensorStuff1(scope1);
  });
  e.submit(() -> {
    try (TensorScope scope2 = new TensorScope()) {
      someTensorStuff2(scope2);
    }
  });
}
```

What am I missing?
Signed-off-by: Ryan Nett <rnett@calpoly.edu>
You mean because the weak refs will be dropped but the deallocation won't be done on the JavaCPP side? My thinking is to make the weakness optional, with it being strong by default if noPointerGC is true. Is that ever exposed anywhere?
That's the kind of thing I've been arguing about. If we don't make attach()/detach() a part of the API, we don't need to worry about these things!

If we don't allow detaching, we don't need to worry about that!

Would you still foresee the need to expose anything if we don't allow users to attach/detach stuff?

There's that, but simply keeping weak references to pointers also doesn't work with GC anyway. The JVM needs at least 2 passes to clear weak and phantom references. We can (and will) lose access to weakly referenced pointers before we get the chance to free them deterministically and then free the context they belong to, so memory will leak. We need to free the ops before the context gets freed, and there is no way to enforce this ordering of events using GC. I think the original approach that was done for TF 1.x does work correctly, though; see #229 (comment):
If we only ever assigned/attached to scopes on creation, we could get rid of these. We'd need a

I'm still not entirely understanding this. You're saying that we may have a tensor get GC'd and then have the scope closed, where the pointer deallocation and the scope-closing deallocation go off at the same time because the weak reference isn't removed in time? What are you referring to with "context"? There's nothing that needs to be freed other than tensors.
For reference counting more generally, the motivating example for me is something like:

```kotlin
class KeepTensor(var tensor: Tensor?)

fun main() {
  val keep = KeepTensor(null)
  TensorScope().use {
    keep.tensor = makeTensor(this)
  }
  // keep.tensor is now closed
}
```

This is very contrived, but the example came from me wondering if

```kotlin
class KeepTensor(tensor: Tensor?) {
  private val scope = TensorScope()
  var tensor: Tensor? = tensor
    set(value) {
      field = value?.let(scope::attach)
    }
}
```

While this works in most cases, if we change our example to something like:

```kotlin
fun main() {
  val keep = KeepTensor(null)
  val keep2 = KeepTensor(null)
  TensorScope().use {
    val temp: Tensor = makeTensor(this)
    keep.tensor = temp
    keep2.tensor = temp
  }
}
```

Since we only allow attaching to one scope, the tensor will only be attached to

Implementation-wise, if we want to generify it to
I'd also like to add a "weak" version of the resource-counting scope that doesn't increment the count, just tries to free things (so if another scope uses and then discards them, they will be discarded then). This would be useful in places like
If the
No, I'm saying that an object may get deallocated after the scope is closed. So you will have live objects even after the scope is closed. You could argue that this isn't a big issue for tensors that have no context, but it renders the use of "scope" moot: if we can't even guarantee when everything is going to get deallocated, we might as well have no scope at all! Moreover, in the case of tensors and ops that do have a context, it is a big deal:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/eager/c_api.h#L92
The only thing that needs reference counting in there are the underlying

```java
try (Scope outerScope = Scope.create()) {
  Tensor t;
  try (EagerSession s = EagerSession.create()) {
    EagerOperation add =
        opBuilder(session, "Add", "ReturnResult")
            .addInput(tf.constant(2).asOutput())
            .addInput(tf.constant(4).asOutput())
            .build();
    t = add.tensor(0).newReference(outerScope);
  }
}
```

Precisely what do you feel would be missing here?

You mean forced deallocation, like a
It would work fine for just tensors, but it doesn't work well for more generic Resources, for the reasons I outlined in one of the last few comments (either creation would need to be handled in the scope, or attaching in the object, and it leaves resource counting to the user, which isn't an issue for Pointers but could be with other things). Requiring all resources to have a
You're saying that it's possible for a tensor to be marked weak (and removed from the set), but added to the deallocator reference key after the context, even though the context is still alive when the scope is closed? So you're suggesting we synchronize the deallocator, make it do nothing if the object is already deallocated, and then store the deallocators and call them before session/scope close (so the objects can still be GC'd, but we are guaranteed all deallocators fire).

That said, this is a non-issue because the docs lie. I went and tested it, and this:

```kotlin
val s = EagerSession.create()
val tf = Ops.create(s)
val a: Operand<TInt32> = tf.math.add(tf.constant(1), tf.constant(2))
TFE_DeleteContext(s.nativeHandle)
println(a.asTensor().getInt())
```

works fine; it even prints correctly. Trying to create new ops after deletion does fail, but that's as intended.

In general, I think it's fine for scopes not to be strict about this. You may have a resource close slightly after the scope, and we can note that in the docs, but requiring every resource to define a non-self-referencing deallocator is a lot of overhead when there aren't any situations that require it (the eager context would be one if the above test didn't work). Resources won't leak indefinitely; there would just be a slight delay (if I'm understanding things right). If there is a need for strict resource -> context deallocation ordering, it should be handled by the context. @karllessard @Craigacp are you ok with this (specifically, not being strict on ordering)?
This works fine if all resources use Pointer internally, but that's not necessarily always the case. In particular, variables would be composite resources that hold onto a bunch of operands, and would have an internal scope to keep them alive, but would also themselves be Resources. Functions would be similar, holding on to multiple ConcreteFunctions.
Nah, I mean something that tries to close everything attached to it if a) it has no references and b) it isn't already closed, but doesn't increment the count itself. For cases like:

```java
try (ResourceScope scope = new WeakResourceScope()) {
  Tensor a = newTensor(scope);
  Tensor b = newTensor(scope);
  Composite c = new Composite(a, b);
  // stuff
  c.close();
}
```

where
I'm not sure I see the issue; it feels to me like a pretty general way of doing things. It's just reference counting! It's what they do in the Python API. It works fine there; there's no reason we can't do everything the same way in Java. Maybe some things wouldn't be pretty, I don't know, but there's no reason it just wouldn't work. Would you have a concrete example of what you would like to do but that wouldn't work?
Yes, that's what the code in TF 1.x was doing. Why do you guys keep pushing back against what was being done in TF 1.x? I mean, I know it's convoluted, but at least it kind of sort of worked and did what you want to do. (I still think it's a bad idea in general, so I'm not going to care personally about this.)
Why do you want to play with fire when there's a perfectly valid way to work around this? And yes, resources can leak indefinitely, see, for example, https://blogs.oracle.com/javamagazine/epsilon-the-jdks-do-nothing-garbage-collector. That's the kind of thing that people using DJL are talking about. Like I already told @Craigacp, if that's not the kind of application that interests people here, they'll probably have to move away from TF Java and I'll help them get whatever they need with JavaCPP directly.
Ok, so we should probably flesh this out a bit to figure out exactly what we want the thing to do. I don't see why having a
That sounds just like an implicit
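The "it's just reference counting" position can be made concrete with a few lines of Java. This is a minimal illustrative sketch (in the spirit of CPython's per-object refcount); `RefCounted` is not a TF-Java or JavaCPP class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal reference counting: retain() when a new owner (e.g. a scope)
// takes a reference, release() when it lets go; the resource is freed
// deterministically when the count reaches zero.
final class RefCounted {
    private final AtomicInteger refs = new AtomicInteger(1); // creator holds one ref
    private volatile boolean freed;

    void retain() {
        refs.incrementAndGet();
    }

    void release() {
        if (refs.decrementAndGet() == 0) {
            freed = true; // stand-in for deallocating the native resource
        }
    }

    boolean isFreed() {
        return freed;
    }
}
```

Deallocation happens at the last `release()`, so no owner can pull the resource out from under another — the property argued for above when guaranteeing a single known reference isn't practical.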
It would work — it's not impossible to do — but it requires a) every resource implementing a copy method, and b) the constructor and copy method accepting a scope and attaching to it. That's not something I want to force
If you're using a "do nothing GC" but still relying on JavaCPP's GC cleanup, I don't think that's our issue. This would work fine with Epsilon as long as JavaCPP's

We could also make registering a deallocator optional and use it for Pointer-based classes; I just don't want to require it for every resource when it's almost always a non-issue and is a decent amount of overhead to implement (and prone to confusing, hard-to-debug bugs if you keep the resource reference around).
I'm not proposing a detach method. This has nothing to do with

Again, detach doesn't have anything to do with this. This is analogous to the default scope mentioned in that PR, but instead of a single default, it's one that can be used by our existing top-level scopes. It still "owns" resources in the sense that it will close them when it closes, but it doesn't prevent them being closed earlier if they are part of some other earlier-closing scope.
As per the above comment, I'm closing this PR; we need a non-tensor-specific resource scope. I don't have time to do it at the moment, but may eventually.
Adds a `TensorScope` class for managing native tensor resources. Behavior is as described in the class javadoc.

It manages `RawTensor` instances instead of managing `TF_Tensor` directly, to allow for multiple `RawTensor`s that reference the same native handle. While this is an edge case, there's no real cost to handling it.

Tensors created from eager operations will be closed when the eager session closes, like they are now, but can be detached. It does not manage eager operations or NDArrays in any way.

I also added `HasTensors` to allow for easy resource management of tensor groups/collections (i.e. detach the whole list or object at once). `TensorList`, `TensorMap`, and similar classes should implement this.
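To illustrate the kind of group-wide resource management the description attributes to `HasTensors`, here is a minimal sketch of what such an interface could look like. This is an assumption for illustration only — the PR's actual `HasTensors` interface and its methods may differ.

```java
import java.util.List;

// Hypothetical group-management interface: a container exposes its
// closeable members, and a default method closes them all at once.
interface HasTensors {
    List<? extends AutoCloseable> tensors();

    default void closeAll() {
        for (AutoCloseable t : tensors()) {
            try {
                t.close();
            } catch (Exception e) {
                throw new RuntimeException(e); // sketch: surface close failures
            }
        }
    }
}
```

A `TensorList` or `TensorMap` implementing something like this would let callers close (or, in the PR's design, detach) a whole collection with one call instead of iterating manually.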