
Initialization improvements #178

Merged 7 commits from rn_initalize_helpers into tensorflow:master on Jan 7, 2021

Conversation

rnett
Contributor

@rnett rnett commented Dec 29, 2020

Makes three simple improvements to initialization:

  • Makes Ops.initAdd a no-op in eager mode, since the op has already been run.
  • Adds a runInit() method to Session that runs the graph's initializers.
  • Adds a doInitialization() method to Session.Runner to add the initializers as targets.

There are no changes that conflict with the type refactor, so I'll just wait until it's merged and rebase onto master.
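Taken together, the changes described above might be used like this (a sketch based on the bullet descriptions, following the snippet style used later in this thread; the exact signatures in the merged version may differ, since a later commit removes doInit()):

```java
import org.tensorflow.Graph;
import org.tensorflow.Session;
import org.tensorflow.op.Ops;

try (Graph g = new Graph()) {
  Ops tf = Ops.create(g);
  // In graph mode, initAdd registers the variable's initializer;
  // in eager mode it is now a no-op, since the op already ran.
  tf.variable(tf.constant(10));

  try (Session s = new Session(g)) {
    s.runInit();  // runs the initializers already registered on the graph
  }
}
```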

Signed-off-by: Ryan Nett <rnett@calpoly.edu>
@karllessard
Collaborator

I remember that when we added the Init ops, we discussed having a specialized endpoint like this to initialize the variables, but we ended up deciding that we should have something more generic that lets you run any single op in a session:

try (Graph g = new Graph()) {
  Ops tf = Ops.create(g);
  Variable<TInt32> x = tf.variable(tf.constant(10));
  Variable<TInt32> y = tf.variable(tf.constant(20));

  try (Session s = new Session(g)) {
    s.run(tf.init());
    ...
  }
}

Just double checking, is there any specific reason to go back to a specialized endpoint for variable initialization only? @Craigacp , any comment on this?

@rnett
Contributor Author

rnett commented Jan 1, 2021

Just double checking, is there any specific reason to go back to a specialized endpoint for variable initialization only?

Two reasons: discoverability that initialization is a thing that needs to be run (i.e. if you depend on a network that uses init that you didn't write, you may not even know it exists; I got burnt by forgetting to run it a few times), and doing s.run(tf.init()) creates a new init op each time, whereas runInit() just runs each initialization op without creating anything.

Signed-off-by: Ryan Nett <rnett@calpoly.edu>
Collaborator

@Craigacp Craigacp left a comment


I think I'm fine with adding this back in, the previous discussion about it happened here - #36 (comment)

@Craigacp
Collaborator

Craigacp commented Jan 3, 2021

Just double checking, is there any specific reason to go back to a specialized endpoint for variable initialization only?

Two reasons: discoverability that initialization is a thing that needs to be run (i.e. if you depend on a network that uses init that you didn't write, you may not even know it exists; I got burnt by forgetting to run it a few times), and doing s.run(tf.init()) creates a new init op each time, whereas runInit() just runs each initialization op without creating anything.

All TF 1 networks require the init op to be executed. It was dropped in TF 2 in Python, but that's because the variables changed and things are implicitly eager, whereas we still have Graph and Eager modes, which are quite different. Until we paper over that more (and it might not be possible to do so without more improvements in the C API), TF Java will still always need the init operator.

I guess that means we should document that initialization is still required somewhere.

@karllessard
Collaborator

Again, if we focus on the functional API instead, we can probably call session.run(tf.init()) implicitly when function.call is invoked, or something like that?

@rnett
Contributor Author

rnett commented Jan 3, 2021

Depends on how you store the session in the ConcreteFunction, but yeah probably.

I don't know about Python TF2, but I'm still using it for variable initialization in my PR for variables, and I don't see a way around it. Python uses init_scope() (as a context manager), which I think is the same or similar; I don't know if we have an equivalent.

Signed-off-by: Ryan Nett <rnett@calpoly.edu>
@karllessard
Collaborator

Thanks @rnett, I like this simplified version, and I personally like making it easier to initialize the variables. I've left a few minor comments if you want to take a look before I merge.

Still to be discussed, but I would like to see at some point the Functional API taking care of initializing the variables once before calling the graph, without the need for the user to do it.

karllessard previously approved these changes Jan 7, 2021
@rnett
Contributor Author

rnett commented Jan 7, 2021

I'll change those.

I agree about variable initialization; we could have the session track whether it's initialized, and run the initializers on the first run call if not? We'd probably want to prevent manual usage of the init op then.
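That idea could look something like this hypothetical wrapper (all names here are invented for illustration; this is not the actual Session API):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: a session-like wrapper that runs the graph's
// initializers exactly once, before the first real run() call.
class InitTrackingSession {
  private final AtomicBoolean initialized = new AtomicBoolean(false);
  private final Runnable initializers; // stands in for Session.runInit()

  InitTrackingSession(Runnable initializers) {
    this.initializers = initializers;
  }

  void run(Runnable op) {
    // compareAndSet ensures the initializers run at most once,
    // even if run() is called many times.
    if (initialized.compareAndSet(false, true)) {
      initializers.run();
    }
    op.run();
  }
}
```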

Signed-off-by: Ryan Nett <rnett@calpoly.edu>
@karllessard karllessard merged commit 0fc54c8 into tensorflow:master Jan 7, 2021
JimClarke5 pushed a commit to JimClarke5/java that referenced this pull request Jan 30, 2021
* No-op on initAdd in eager mode

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* runInit() method in session

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* add doInitialization() to Runner

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* fix javadoc

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* assume only graph or eager environments

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Remove doInit(), update javadocs

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* small fixes

Signed-off-by: Ryan Nett <rnett@calpoly.edu>
karllessard pushed a commit that referenced this pull request Feb 1, 2021
* Initial checkin

* Initial checkin and sync with master

* Initial checkin and sync with master

* JavaDoc cleanup

* Javadoc fixes

* Change LossInterface to LossMetric.
Fix JavaDoc,
modify one line code block to include braces.

* Removed hashmap for variables, they are not needed as the variables only live within a single instance of a Metric.

* reformat code

* Add tests for assertBroadcastable

* Change type to resultType

* Added V data type for sampleWeights so that it is not forced to be the same type as the return or internal variables,

* change 'type' to 'resultType'

* clean up mean and fix assert assertBroadcastable

* fix error message

* Change sampleWeights to have its own generic type <S extends TNumber>

* Add comment about invalid tests expecting IllegalArgumentExceptions

* Add this exception instead of the more generic IllegalArgumentException when static shapes cannot broadcast.

* change IllegalArgumentException to NotBroadcastableException.
change hasValidNonscalarShape to canBroadcastNonscalarShapes
change hasValidNonscalarShape to canBroadcastNonscalarShapes

* reformat code

* Fix Javadoc
move the dynamic shapes and rank down to the dynamic section so they are not created needlessly when static
Fix if statement to check for unknown size and unknown dimensions

* Fix Reduce to use broadcastWeights,
renamed WeightBroadcastTest to AssertBroadcastableTest and added BroadcastWeightsTest

* Added comment to count to indicate that it may be weighted.

* Added SetsOps and fixed AssertBroadcastable to use SetsOps methods,

* Fixed based on various PR comments.

* Deleted, no longer needed after change to Variable handling in Metrics.

* Nicer error messages for mode-forbidden ops (#169)

* start forbidden ops checks

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* fix style

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* move checks to builder method

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Initialization improvements (#178)

* No-op on initAdd in eager mode

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* runInit() method in session

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* add doInitialization() to Runner

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* fix javadoc

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* assume only graph or eager environments

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Remove doInit(), update javadocs

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* small fixes

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Clarify tensorOf lifetime requirements (#190)

* Clarify tensorOf lifetime requirements

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Do codegen

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Remove extra generics from op generation (#193)

* Successfully remove extra type params, but it broke javadoc generation

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Generate covariant types

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Do generation

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Update help text.

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Fixes

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Add Java 11 support - Initial Phase (#185)

* Add profile for JDK11 and  Automatic-Module-Name to jars

* add maven.compiler.release=11

* Update manual ops for new codegen (#196)

Signed-off-by: Ryan Nett <rnett@calpoly.edu>

* Fix Losses to use CHANNELS_FIRST/LAST for CategoricalCrossentropy

* Fix SetOps to properly convert sparse tensor to dense tensor using tf.sparse.sparseToDense with the output of tf.sparse.denseToDenseSetOperation


Co-authored-by: Ryan Nett <rnett@calpoly.edu>
@rnett rnett deleted the rn_initalize_helpers branch October 17, 2021 00:44