Add Session Result class #167

rnett · 2020-12-08T09:23:52Z

Adds a Session.Result class, which wraps the output tensors and the fetched outputs. This allows querying results in the same style as fetch (by name, by name and index, by Output, and by Operand). The Output and Operand methods are also typesafe. It's Iterable and AutoCloseable (close closes the result tensors). The run().get(0) usage still works, so even though it's a breaking change, it should be pretty minor.
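
For readers skimming the thread, the shape described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: FakeTensor and ResultSketch are hypothetical names, keys are plain strings rather than Output objects, and no TensorFlow types are used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tensor; it only tracks whether it was closed.
final class FakeTensor implements AutoCloseable {
  final String value;
  private boolean closed = false;

  FakeTensor(String value) {
    this.value = value;
  }

  boolean isClosed() {
    return closed;
  }

  @Override
  public void close() {
    closed = true;
  }
}

// Hypothetical result wrapper: keyed and indexed access, iteration, one close().
final class ResultSketch implements AutoCloseable, Iterable<FakeTensor> {
  private final Map<String, FakeTensor> byName;
  private final List<FakeTensor> byIndex;

  ResultSketch(LinkedHashMap<String, FakeTensor> fetched) {
    // Defensive copies wrapped as unmodifiable views; insertion order is the fetch order.
    this.byName = Collections.unmodifiableMap(new LinkedHashMap<>(fetched));
    this.byIndex = Collections.unmodifiableList(new ArrayList<>(fetched.values()));
  }

  // Indexed access, mirroring the run().get(0) style.
  FakeTensor get(int index) {
    return byIndex.get(index);
  }

  // Keyed access; throws if the key was never fetched.
  FakeTensor get(String name) {
    FakeTensor t = byName.get(name);
    if (t == null) {
      throw new IllegalArgumentException("No fetched output named " + name);
    }
    return t;
  }

  boolean contains(String name) {
    return byName.containsKey(name);
  }

  @Override
  public Iterator<FakeTensor> iterator() {
    return byIndex.iterator();
  }

  // Closes every tensor held by this result.
  @Override
  public void close() {
    for (FakeTensor t : byIndex) {
      t.close();
    }
  }
}
```

Used in a try-with-resources block, a wrapper like this closes all of its tensors together when the block exits.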

I also moved the getOutput (from name and from name and index) methods from Session to Graph and made them public.

I'd like to make feed typesafe as well, but was not able to find a good way to do it.

Craigacp

You should remove the AutoCloseableList class from the tests too, as this basically replaces it. You might also want to look at the very similar class I wrote in the ONNX Runtime Java API - https://github.com/microsoft/onnxruntime/blob/master/java/src/main/java/ai/onnxruntime/OrtSession.java#L1091.

Craigacp · 2020-12-09T00:59:38Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Graph.java

+   * @return operation in the graph with this name
+   * @see #operation(String)
+   */
+  public GraphOperation operationOrError(String name) {


Why wouldn't this return null or Optional<GraphOperation> rather than throwing? Is it a Kotlin thing?

Kind of. It definitely works better from Kotlin, and eventually I'd want to mark it @NonNull. I moved the method here from Session, where it already threw rather than returning null. Since we already had operation in Graph (which returns null if not found), I added this as operationOrError rather than moving the check to each use.
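
The pairing of a nullable lookup with a throwing variant, as discussed in this exchange, might look like the sketch below. OpRegistry is a hypothetical stand-in for Graph, operations are represented as plain strings, and the choice of IllegalArgumentException is an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry standing in for a graph's name -> operation lookup.
final class OpRegistry {
  private final Map<String, String> ops = new HashMap<>();

  void add(String name, String op) {
    ops.put(name, op);
  }

  // Nullable lookup: returns null when no operation has this name.
  String operation(String name) {
    return ops.get(name);
  }

  // Throwing lookup: never returns null, so callers need no null check.
  String operationOrError(String name) {
    String op = operation(name);
    if (op == null) {
      throw new IllegalArgumentException("No operation named '" + name + "' in the graph");
    }
    return op;
  }
}
```

Keeping both variants lets callers that expect the name to exist fail fast, while callers that are probing can still test for null.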

Craigacp · 2020-12-09T01:01:58Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Graph.java

+      int index = Integer.parseInt(output.substring(colon + 1));
+      return new Output(operationOrError(op), index);
+    } catch (NumberFormatException e) {
+      return new Output(operationOrError(output), 0);


This fallback case seems odd, as it doesn't log anything, and if it's triggered it's likely to be programmer error, right?

I moved that from Session, and I don't think it's too odd. It's less about the actual number and more about op names like scope:myOp: you want to get scope:myOp, not try to interpret myOp as a number.

Logging it would be a good idea, but it doesn't look like there's any logging currently set up. Is there one I can use?
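
To make the fallback being discussed concrete, here is a minimal sketch of the parsing rule. OutputRef and OutputNames are hypothetical names, and locating the separator with lastIndexOf is an assumption, since the diff above does not show how colon is computed.

```java
// Hypothetical value type: an operation name plus an output index.
final class OutputRef {
  final String opName;
  final int index;

  OutputRef(String opName, int index) {
    this.opName = opName;
    this.index = index;
  }
}

final class OutputNames {
  private OutputNames() {}

  // Parses "name" or "name:index"; a non-numeric suffix is treated as part of the name.
  static OutputRef parse(String output) {
    int colon = output.lastIndexOf(':');
    if (colon == -1 || colon == output.length() - 1) {
      // No usable suffix: the whole string is the op name, output 0.
      return new OutputRef(output, 0);
    }
    try {
      String op = output.substring(0, colon);
      int index = Integer.parseInt(output.substring(colon + 1));
      return new OutputRef(op, index);
    } catch (NumberFormatException e) {
      // e.g. "scope:myOp": "myOp" is not an index, so keep the full name.
      return new OutputRef(output, 0);
    }
  }
}
```

Under this rule, "add:1" refers to output 1 of add, while "scope:myOp" refers to output 0 of the op named scope:myOp.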

Craigacp · 2020-12-09T01:06:14Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Session.java

+    }
+
+    /**
+     * Get the result for {@code output} or throw an {@code IllegalArgumentException} if it wasn't fetched.


Given these get methods are basically Map lookups it's a bit unfriendly to have them throw rather than return null or Optional<> as would be idiomatic for Java.

I'm of two minds here. On one hand, it is a map lookup, so that makes sense. On the other hand, since everything in this map is stuff you passed to fetch, you will most likely only be calling get with things you know exist, and any times you don't will most likely be errors for which throwing makes more sense. I could always add getOrError but that seems like overkill especially w/ the added contains methods. Thoughts?

Also, now that I have a getter for the map, if you want null returns you can just use that.

@Craigacp thoughts on renaming it to fetch (w/ the same behavior as now)?

Craigacp · 2020-12-09T01:07:13Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Session.java

-        inputs.add(op.output(index));
-        inputTensors.add(t);
-      }
+      Operation op = graph.operationOrError(operation);


This is a behaviour change as before it used to silently not add anything if the operation didn't exist, now it throws an exception. This should be documented.

It didn't actually change, since operationByName (which is now Graph#operationOrError) would throw if it was going to return null.

Craigacp · 2020-12-09T01:07:56Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Session.java

-      if (op != null) {
-        outputs.add(op.output(index));
-      }
+      Operation op = graph.operationOrError(operation);


Doc update for the behaviour change.

Craigacp · 2020-12-09T01:08:15Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Session.java

-        targets.add(op);
-      }
+      GraphOperation op = graph.operationOrError(operation);
+      targets.add(op);


Doc update for the behaviour change.

karllessard · 2020-12-09T13:49:26Z

Hi @rnett , sorry for the delay, I've been a bit busy these days, but before I go through this one, I wanted to let you know that I was also planning to add something similar in the tensor type refactoring that is currently in progress, you can take a look at the TensorList and TensorMap utilities in the original draft here: https://github.com/karllessard/tensorflow-java/tree/tensor-ttype/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/util

The main purpose in my case was to support implicit casting of the tensor types in the list itself, which is why Session doesn't return a List anymore but an object. So we can do:

try (TFloat32 t = s.runner().fetch(x).run().single()) {
    // ...
}

instead of

try (TFloat32 t = (TFloat32)s.runner().fetch(x).run().get(0)) {
    // ...
}
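
One way such an implicitly casting accessor could be written is with a generic method whose type parameter is inferred from the assignment target. The sketch below is an assumption about the general technique, not the actual TensorList code in the linked branch; SketchTensor, FloatTensor, and TensorListSketch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tensor hierarchy used only for this illustration.
interface SketchTensor {}

final class FloatTensor implements SketchTensor {
  final float value;

  FloatTensor(float value) {
    this.value = value;
  }
}

final class TensorListSketch {
  private final List<SketchTensor> tensors;

  TensorListSketch(List<? extends SketchTensor> tensors) {
    this.tensors = new ArrayList<>(tensors);
  }

  // T is inferred from the call site, so no explicit cast is needed there.
  // The cast is unchecked: a wrong target type fails with ClassCastException
  // at the call site rather than at compile time.
  @SuppressWarnings("unchecked")
  <T extends SketchTensor> T single() {
    if (tensors.size() != 1) {
      throw new IllegalStateException("Expected exactly one tensor, got " + tensors.size());
    }
    return (T) tensors.get(0);
  }
}
```

The trade-off is that the compiler can no longer catch a mismatched type, which is part of why typesafe lookups by Output or Operand are attractive.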

So before I start reviewing: I don't know how critical having this PR merged is for you, but we need to take into account our future needs, if there is agreement on what they are. Either we start that discussion now, or we wait until the type refactoring is over. wdyt?

rnett · 2020-12-09T20:58:57Z

Looking at TensorMap/List, I think we'll still want a Result class, for get(Operand/Output) (with type safety + implicit casting) and for immutability. I'd probably use TensorList and TensorMap instead of the List and Map I have currently. What do you think about adding ImmutableTensorMap/List classes? I'm not sure how much it would be used, but for this it would certainly be nice. Generic key types (like for Output) for TensorMap would also be nice.

As for the timing, I think we'd be fine to merge this now and then refactor it a bit with the type system, but it doesn't matter too much either way. I mostly wanted this so I could add proper extensions to the Kotlin API, which is waiting on the type system refactor anyway.

google-cla · 2020-12-28T02:59:44Z

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

karllessard · 2020-12-28T16:36:24Z

@rnett , I've pushed my unmerged-branch to a new branch in this repository: https://github.com/tensorflow/java/tree/type-refactor

Can you please rebase your work on it and target it in your PR so that we only see the changes coming from your branch?

rnett · 2020-12-28T21:49:25Z

@karllessard your branch is a few commits behind master. Are you planning on rebasing it? If so I'll wait for you to do that before doing mine.

karllessard · 2020-12-29T17:23:11Z

Oh sorry @rnett , it looks like I pushed the wrong version of my branch; you should be good to rebase on it now. Refreshing it also had the nasty side effect of closing the PRs that were targeting that branch. I've reopened this one, but if you see other PRs that have been closed by mistake, please reopen them, thanks.

Craigacp · 2020-12-30T20:09:53Z

@rnett the order swaps because we haven't run down all the non-determinism in the ops generator, so sometimes it emits things in a different order causing spurious changes in PRs.

karllessard · 2021-01-01T16:50:31Z

Thanks for looking at this @rnett , I was eager to see such a utility to simplify the manipulation of a group of tensors.

Before going too deep into the analysis of the current proposal, I would like to point out something important to consider. While TF Java has always been graph/session centric, it will gradually move towards a more functional approach, like @tf.function does in Python. This is what the new ConcreteFunction is partially achieving, and the core will continue to build up around it to improve the support of functions as the main API for building and executing graphs.

That being said, I think it is important for the solution we choose for manipulating a "bundle" of tensors to be fully compatible with the concept of functions. For example, in functions, tensors are bundled not only as the result of an execution but also to feed the underlying session. Right now everything is handled using HashMap, but I'm sure that whatever utility we come up with for returning the result of a session run should apply to the feeding as well.

If we do that, then the Result class should live outside Session and be renamed to something else. Again, I have not gone through the whole proposal yet, but before doing so I would like to hear your thoughts about it, thanks!

Craigacp · 2021-01-01T20:50:44Z

@karllessard, I understand the desire for a bundle as inputs. When using ONNX Runtime, which has a result object as the output, I've recently been missing an equivalent object for inputs to clean those up easily. However, for TF I think it might not be ideal. The session result object is helpful because it's an easy way to control the lifetime of a block of tensors which are expected to have the same lifetime, but the input tensors may have different lifetimes (e.g. I frequently have a long-lived boolean tensor that sets the network into training mode, and often a long-lived one for the current training epoch or learning rate). When the lifetimes of the tensors differ, an AutoCloseable container seems less relevant.

If we wanted to lean into dynamic code generation (which we probably don't due to having to update ASM to track new JDKs every 6 months) then we could make a type safe autocloseable container for the inputs and outputs on load of the ConcreteFunction and then it would expose getters and setters with the right names.

rnett · 2021-01-01T23:16:23Z

@karllessard The Result class is a bit different than a generic tensor container in that it provides keyed and indexed access, as well as Operand/String -> Output mapping, which depends on the Graph instance (it also has the run metadata). If we do end up having tensor containers, I think they would be best used as the backing List/Map in Result, rather than replacing it.

The concrete function API looks neat. Correct me if I'm wrong (it's been a while since I worked with @tf.function), but the goal is essentially to create what looks like an eager function but is actually backed by a graph? I've been playing around with ideas for something similar in Kotlin using compiler plugins (you could use annotations on functions or lambdas), but I'm not sure how you could do the same in Java without ASM generation.

This is probably more of a Kotlin thing than Java, but have you given any thought to having some mechanism for tensor lifetime scopes (like PointerScope)? It seems like most tensors should be lexically scoped (i.e. try with resources), and having some kind of scoping mechanism would make this a lot easier to manage, while still allowing non/globally scoped tensors when needed.
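
As a rough illustration of the scoping idea raised here, a lexical scope can collect resources and close them together on exit. ResourceScope and its attach and detach methods are hypothetical names chosen for this sketch; they are not taken from PointerScope or from any TensorFlow Java API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical lexical scope: everything attached is closed when the scope closes.
final class ResourceScope implements AutoCloseable {
  private final Deque<AutoCloseable> resources = new ArrayDeque<>();

  // Registers a resource with this scope and returns it for convenience.
  <T extends AutoCloseable> T attach(T resource) {
    resources.push(resource);
    return resource;
  }

  // Removes a resource from this scope so it can outlive the scope.
  <T extends AutoCloseable> T detach(T resource) {
    resources.remove(resource);
    return resource;
  }

  // Closes the remaining resources in reverse order of attachment.
  @Override
  public void close() {
    RuntimeException failure = null;
    while (!resources.isEmpty()) {
      try {
        resources.pop().close();
      } catch (Exception e) {
        if (failure == null) {
          failure = new RuntimeException("Failed to close a scoped resource", e);
        } else {
          failure.addSuppressed(e);
        }
      }
    }
    if (failure != null) {
      throw failure;
    }
  }
}
```

Detaching covers the long-lived tensors mentioned earlier in the thread, since a detached resource is left open when the scope ends.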

karllessard · 2021-01-02T02:47:48Z

Thanks @Craigacp and @rnett for your good feedback. My belief is that if we are about to improve the usability of our API, we should focus more on functions than on the graph and sessions. @rnett to answer your question, yes, the goal of a ConcreteFunction is to mimic a little bit what Python does, i.e. to easily convert a function so that it can be called eagerly or backed by a graph. Right now, only graph mode is supported by ConcreteFunction, but nothing prevents a user from directly calling the same method passed as the functionBuilder with an eager session to execute it eagerly. Though I would prefer to make the eager support more explicitly integrated with the function concept. As for whether we should use an annotation, like Python does, I guess it could work, but I haven't tried to think through how this would fit in the actual design.

Now for the differences in resource management between the inputs and outputs, I was also aware of this detail. For the sake of brainstorming, maybe reference counting could be useful here. For example, when we pass a tensor to a bundle, we could just increase the reference count so the tensor only gets released once all references are released. Also, ConcreteFunction already has its own way to release or not release its resources (the graph and the session) depending on how it has been allocated... I don't have the complete paradigm in mind, but we can continue to think about it if we all think that could be something useful, wdyt?

Another point in favor of focusing on the functional API is that it works both with training and inference (after loading a saved model bundle), while using the graph and sessions directly has only worked well for training since TF2.0, if you remember the issues @Shajan was facing before.
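
The reference-counting idea floated above could be sketched as follows. This is a brainstorming aid under stated assumptions rather than a proposed design: RefCounted, retain, and refCount are hypothetical names, and the native release step is represented by a plain Runnable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted handle: the underlying resource is released
// only when the last holder closes it.
final class RefCounted implements AutoCloseable {
  private final AtomicInteger refs = new AtomicInteger(1); // the creator holds one reference
  private final Runnable releaser;

  RefCounted(Runnable releaser) {
    this.releaser = releaser;
  }

  // Called when another holder (for example a bundle) starts sharing the resource.
  RefCounted retain() {
    int current;
    do {
      current = refs.get();
      if (current <= 0) {
        throw new IllegalStateException("Already released");
      }
    } while (!refs.compareAndSet(current, current + 1));
    return this;
  }

  int refCount() {
    return refs.get();
  }

  // Drops one reference; runs the releaser when the count reaches zero.
  @Override
  public void close() {
    int current;
    do {
      current = refs.get();
      if (current <= 0) {
        throw new IllegalStateException("Closed more times than retained");
      }
    } while (!refs.compareAndSet(current, current - 1));
    if (current == 1) {
      releaser.run();
    }
  }
}
```

With this shape, a bundle would call retain() when a tensor is added and close() when the bundle itself is closed, so a long-lived input survives the bundle as long as its owner still holds a reference.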

rnett · 2021-01-02T05:54:29Z

I made #181 for the functional API discussion. I don't think any of what we discussed here is blocking this PR?

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

karllessard · 2021-01-03T04:01:21Z

> I made #181 for the functional API discussion. I don't think any of what we discussed here is blocking this PR?

I don't want to block the progress of the good work being done here, but I admit I would feel more comfortable adding all these new utilities if I could clearly understand how they would fit in an architecture that promotes the functional API. How about we continue the discussion in #181 a bit further before merging this one?

rnett · 2021-01-03T04:41:49Z

Fine by me, although I think we'll want something similar (and nice to use) for Session regardless of what the functional API looks like. There's always going to be someone who has to run things manually.

rnett · 2021-01-14T01:06:58Z

@karllessard after the discussion and looking further at ConcreteFunction, I don't think we need to wait on this. I do think ConcreteFunction could use a Result class that is essentially TensorMap which throws on missing keys, but that will differ from this enough that there's not much to be shared (the inner map for that will be String -> Tensor, for this it's Output -> Tensor, and we have indices).

If you plan to add TensorMap and TensorList soon I could see waiting for them, however.

karllessard · 2021-01-17T22:17:51Z

Hi @rnett , I prefer to keep this PR on hold for now, as #188 seems to be slowly converging towards it but with some redesign. I'm sorry if the PRs are not merged as fast as you would hope, but these address important core aspects that require greater attention and reflection on our end.

rnett · 2021-01-17T22:43:35Z

No worries, I just wanted to make sure that something was indeed blocking it.

Craigacp requested changes Dec 9, 2020

rnett force-pushed the rn_session_run_result branch from d7bd18a to a7903cd Compare December 28, 2020 02:59

karllessard changed the base branch from master to type-refactor December 28, 2020 16:34

karllessard changed the base branch from type-refactor to master December 28, 2020 16:34

karllessard mentioned this pull request Dec 28, 2020

Indexing API #166

Merged

rnett changed the base branch from master to type-refactor December 28, 2020 21:27

karllessard closed this Dec 29, 2020

karllessard deleted the branch tensorflow:master December 29, 2020 17:17

karllessard reopened this Dec 29, 2020

rnett changed the base branch from type-refactor to master December 30, 2020 22:58

rnett force-pushed the rn_session_run_result branch from c51e45a to 5a2f602 Compare December 30, 2020 22:59

rnett requested a review from Craigacp December 30, 2020 23:02

rnett added 4 commits January 2, 2021 16:42

move output finding methods to graph, make public

7202ea3

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

add session result class, use in run()

78384d7

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

make Result implement Iterable

e4cf856

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

fix the only direct usage

786e3fc

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

rnett added 15 commits January 2, 2021 16:42

better docs

4500b56

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

even better docs

666f434

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

remove fixed todos

91b63ac

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Move Result out of Runner

6754d4a

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

fix more tests

c96ec0a

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

fix more tests that didn't show up the first time

64b6f2e

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

remove AutoCloseableList

1193412

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Add input size checking and closed checking

e95ae47

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

add contains method, change map structure and add getter

701f7f0

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

add isClosed to tensor, use to prevent double close

0a771b3

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

fold Run into Result

598da62

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Use Collections.unmodifiable*

f67d6e2

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

rebase fixes

22f0a05

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

docs update

58ff935

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

not sure why this order is swapped

3dc6def

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

rnett force-pushed the rn_session_run_result branch from 5a2f602 to 3dc6def Compare January 3, 2021 00:43

karllessard mentioned this pull request Jan 3, 2021

Functional graph definition API #181

Open

karllessard mentioned this pull request Jan 17, 2021

Add TensorScope #188

Closed

Craigacp closed this Sep 19, 2023
