[WIP] tf.Variable like API for variables #179

rnett · 2020-12-31T03:39:52Z

Fixes #170.

Adds a tf.Variable like class, using the new resource API (see here and here). It's compatible with Eager and Graph mode, and will work better with eager gradient tapes in the future.

A couple of points I'd like feedback on:

Immutability by default. Most of the time variables are only updated by the optimizer, not by any code users will write, so I think this makes sense. It's also not hard to access the mutable version (asMutableVariable()).
Auto-scoping operations to the creation scope. This is the value(), create(), etc. methods that don't take a Scope parameter and use the scope the variable is created with. While these are certainly much nicer to use, they do muddle the semantics a bit, and if gradient tapes are done via Scope it would probably cause issues (since the tapes need to be notified of variable usage, afaik). I originally did it this way because shape() depends on value() which needs a scope, although it is possible to just return the variable's initial shape.

Craigacp

I don't think splitting out mutability is a good choice here. The variable can be mutated under the covers by the TF runtime (and will be, because that's what it's for), and the documentation here isn't clear that this is the case. Also an immutable variable sounds a lot like constant and we've already got those, which these docs don't reference. I'm not clear what's gained by making the split given the only implementation is mutable, and the immutable view has a method that returns the mutable version. Could you give me more of an idea why having an unmodifiable view be the default is worthwhile?

I think using the reference variables is fine and we should probably migrate, but there are a bunch of things in the framework which gloss over the existence of such variables and we'll need to migrate those (as when I ported over the optimizers from Python I ignored all the reference ops and use the regular ones, but presumably we'll need to switch to using the reference ops).

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/ExecutionEnvironment.java

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/variable/Variable.java

Craigacp · 2020-12-31T04:02:22Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/variable/MutableVariable.java

+      }
+      cachedRead = ReadVariableOp.create(scope, handle, tType);
+    }
+    return cachedRead;


This method has data races. In Python they don't need to worry about it, but we should at least consider what the possible behaviours are in Java.

Yeah, I should use a local variable there. Do you think using AtomicReference as well is worth it? I'm not sure how much of the rest is thread safe.

rnett · 2020-12-31T04:24:20Z

isValueInitialized(), which uses IsVariableInitialized, requires a ref tensor, is there any way to get this in the Java API yet?

If not, I'll have to remove the method.

rnett · 2020-12-31T04:34:54Z

The immutable (more properly read only) version is somewhat similar to Kotlin's collections. It's always backed by a mutable version, and may be casted and mutated in most cases, but the mutation APIs are hidden. So not it's immutable, just a read only view of the mutable variable, semantically. I do need to update the docs a bit to reflect that. The idea was since that generally users assigning to a variable will be an error, it should be harder to do and be made more explicit.

rnett · 2020-12-31T04:39:16Z

Optimizers definitely will need a pass, I originally didn't do it as part of this PR as I wasn't going to use the resource variables, but now that I am I probably should. I'd like to make it a bit easier to pass in the list of variables to update on rather than using the entire graph, too, for multi-model support (i.e. GANs).

One other thing that came up: I can't use initialization as a control dependency since it would be re-ran each time. Is there a way to require initialization to have been ran in the session? You will get an uninitialized error if you don't run it, but I'd like a clearer way (or ideally to run it automatically, like a control dependency, just only once per session).

rnett · 2021-01-03T02:02:22Z

Currently somewhat blocked by tensorflow/tensorflow#46114, although if/when we add gradient registry ops we could work around it.

Craigacp · 2021-01-03T02:35:07Z

The immutable (more properly read only) version is somewhat similar to Kotlin's collections. It's always backed by a mutable version, and may be casted and mutated in most cases, but the mutation APIs are hidden. So not it's immutable, just a read only view of the mutable variable, semantically. I do need to update the docs a bit to reflect that. The idea was since that generally users assigning to a variable will be an error, it should be harder to do and be made more explicit.

Sure, but that's not how Java's collections work and I think it might be preferable not to leak Kotlin-isms into the Java API used by all the JVM languages. Doesn't python expose these operations to allow updates to epoch numbers, learning rates and similar? I think those would be common use cases we'd have in Java too.

Currently somewhat blocked by tensorflow/tensorflow#46114, although if/when we add gradient registry ops we could work around it.

The lack of this gradient (which presumably just passes straight through) means it's not possible to train any models which contain this variable right?

rnett · 2021-01-03T04:46:30Z

The lack of this gradient (which presumably just passes straight through) means it's not possible to train any models which contain this variable right?

Yeah, and it is just a passthrough.

Doesn't python expose these operations to allow updates to epoch numbers, learning rates and similar? I think those would be common use cases we'd have in Java too.

Yeah, true, I'll refactor that. If/when proper logging is added it would be good to have a warning when variables are mutated when they aren't "supposed" to be (i.e. in the forward pass).

Craigacp · 2021-01-15T04:56:51Z

The lack of this gradient (which presumably just passes straight through) means it's not possible to train any models which contain this variable right?

Yeah, and it is just a passthrough.

Ok, well this needs to go on hold till they fix that upstream then.

Doesn't python expose these operations to allow updates to epoch numbers, learning rates and similar? I think those would be common use cases we'd have in Java too.

Yeah, true, I'll refactor that. If/when proper logging is added it would be good to have a warning when variables are mutated when they aren't "supposed" to be (i.e. in the forward pass).

How would we identify what's a forward pass vs a backward pass? Privileging the target names seems restrictive.

Craigacp · 2021-01-15T04:58:49Z

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/variable/Variable.java

+
+  private final Shape shape;
+  private final DataType dataType;
+  private final boolean trainable;


Shouldn't this be mutable? In Keras people freeze and unfreeze layers all the time, so flipping this back and forth seems reasonable.

It's not in python, so I copied that (see here). It doesn't seem that hard to do, but we'd want to have something to ensure it's not changed while a gradient tape is active in eager mode, or have some special handling for that. It depends on what that API looks like though.

I don't know how keras handles it. Maybe with the StopGradient op? It seems like they have their own trainable and non-trainable weight management outside of this.

Ah ok, so the python docs make sense. The notes on the behaviour of trainable in saved model means it's still counter-intuitive as I think I'd like to store the epoch count in a checkpoint, it makes it easier to restart training, but oh well.

It would be nice to add the trainable variables to a hook in the graph the same way they do in Python. The registerVariable hook could check if it's trainable.

I could, yeah. I figured it was trivial to filter the variables list. I would think any optimizer implementations would want to take a list of variables from whatever types of models they use anyways, to allow for multiple models in the same graph.

BTW, the Keras Model collects the "trainable" variables and passes them to the optimize function.

tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/variable/Variable.java

rnett · 2021-01-15T05:20:11Z

Ok, well this needs to go on hold till they fix that upstream then.

Yeah. I made a PR following the C++ gradient instructions, but it doesn't work. I'm going to follow up in the mailing list but haven't had time yet.

How would we identify what's a forward pass vs a backward pass? Privileging the target names seems restrictive.

Yeah, I don't know how feasible that is. It would probably have to be a framework level setting which we shouldn't mix into core. For eager mode, it's easy enough to check if any gradient tapes are active, but I don't know of a way to do that in graph mode.

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

…sion) Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

…o match python Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

… fixed in tensorflow#237 PR. Signed-off-by: Ryan Nett <rnett@calpoly.edu>

rnett · 2021-03-12T20:55:14Z

This will change fairly significantly due to #237 (and parts are currently broken), so don't merge it. However, with tensorflow/tensorflow#46115, the gradient works.

JimClarke5 · 2021-05-06T20:42:00Z

In Metrics and Model, several variables are updated outside of Optimizers.

JimClarke5 · 2021-05-06T20:47:28Z

callbacks.LearningRateScheduler requires a way to update the learningRate in an Optimizer. The original thought was to use Placeholders for this, but if we could use a Variable we wouldn't need to do funny stuff with feeds and changing Tensor values.

Craigacp reviewed Dec 31, 2020

View reviewed changes

rnett requested a review from Craigacp January 3, 2021 02:01

Craigacp reviewed Jan 15, 2021

View reviewed changes

rnett mentioned this pull request Feb 3, 2021

Functional API: Execution environment agnostic function #205

Open

rnett mentioned this pull request Mar 11, 2021

Initialization and Variables #237

Open

rnett added 16 commits March 12, 2021 12:12

Start of tf.Variable style variable class

e4cb2fb

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Javadoc, change initialize to no-op if already done (for function ver…

52c23fe

…sion) Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Change to immutable by default

3c75f27

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

todo

c5ad2b3

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

new type system

19e6f3c

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Resource API version

d4336de

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

forgot a not

3fc2613

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

forgot a not

8b789d9

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

don't add init in eager mode

7e2dc39

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Add handle getter, fix data races

4f8c541

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

try-with-resources in test

2ae55ab

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

typo fix

49e1547

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

make assignAdd and assignSub have control deps on any cached reads, t…

1309ee5

…o match python Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Add gradient test, ignore and note C++ issue

8d3ea0c

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Add variable options, with trainable (not used yet)

3b0a6f5

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

Merge Variable and MutableVariable, un-deprecate registerVariable

6294f81

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

rnett force-pushed the rn_variable branch from 053dafd to 6294f81 Compare March 12, 2021 20:14

Some test fixes - gradients work. Control deps are broken and will be…

342744f

… fixed in tensorflow#237 PR. Signed-off-by: Ryan Nett <rnett@calpoly.edu>

JimClarke5 mentioned this pull request Jun 14, 2021

Metrics Add init(Ops) method #337

Closed

rnett closed this Sep 19, 2021

rnett deleted the rn_variable branch September 19, 2021 00:58

[WIP] tf.Variable like API for variables #179

[WIP] tf.Variable like API for variables #179

Uh oh!

Conversation

rnett commented Dec 31, 2020

Uh oh!

Craigacp left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Craigacp Dec 31, 2020

Choose a reason for hiding this comment

Uh oh!

rnett Dec 31, 2020

Choose a reason for hiding this comment

Uh oh!

rnett commented Dec 31, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rnett commented Dec 31, 2020

Uh oh!

rnett commented Dec 31, 2020

Uh oh!

rnett commented Jan 3, 2021

Uh oh!

Craigacp commented Jan 3, 2021

Uh oh!

rnett commented Jan 3, 2021

Uh oh!

Craigacp commented Jan 15, 2021

Uh oh!

Craigacp Jan 15, 2021

Choose a reason for hiding this comment

Uh oh!

rnett Jan 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Craigacp Jan 15, 2021

Choose a reason for hiding this comment

Uh oh!

Craigacp Jan 15, 2021

Choose a reason for hiding this comment

Uh oh!

rnett Jan 15, 2021

Choose a reason for hiding this comment

Uh oh!

JimClarke5 May 6, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

rnett commented Jan 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rnett commented Mar 12, 2021

Uh oh!

JimClarke5 commented May 6, 2021

Uh oh!

JimClarke5 commented May 6, 2021

Uh oh!

Uh oh!

rnett commented Dec 31, 2020 •

edited

Loading

rnett Jan 15, 2021 •

edited

Loading

rnett commented Jan 15, 2021 •

edited

Loading