
RFC: Functions, not sessions in 2.0 #20

Merged (15 commits) on Nov 16, 2018

Conversation

asimshankar (Contributor) commented Sep 20, 2018

Review period closes 2018-10-04

Functions, not sessions in 2.0

Status: Proposed
Author(s): ashankar@google.com, joshl@google.com
Sponsor: apassos@google.com
Updated: 2018-09-18

Proposal to make TensorFlow more "Pythonic" in 2.0. In four bullet points (a brief before/after sketch follows the list):

  • Encourage the encapsulation of graph computation as Python functions
    (where the graph is executed when the function is invoked, instead of via Session)
  • Align "state" in the TensorFlow runtime (e.g., resource tensors like those that back tf.Variable objects) with state in the Python program (e.g., Python objects corresponding to the runtime state with lifetimes attached to each other).
  • Make it easy to export these encapsulations to a GraphDef+Checkpoint and/or SavedModel.
  • Enable eager execution by default.
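
A minimal before-and-after sketch of the first two points, assuming the decorator ends up spelled tf.function as several commenters below suggest (the RFC text itself uses defun); the exact API here is illustrative, not final:

import tensorflow as tf

# 1.x style: build a graph, then execute it through a Session.
#   x = tf.placeholder(tf.float32)
#   y = x * x
#   with tf.Session() as sess:
#     print(sess.run(y, feed_dict={x: 3.0}))

# Proposed 2.0 style: the computation is an ordinary Python function; its
# backing graph is built on first call and executed whenever it is invoked.
@tf.function
def square(x):
  return x * x

print(square(tf.constant(3.0)))  # runs the graph directly, no Session needed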

ewilderj added the "RFC: Proposed" (RFC Design Document) and "2.0" (TensorFlow 2.0 development) labels on Sep 20, 2018
ngc92 commented Sep 20, 2018

For Python 3 users, it would be nice if, instead of specifying expected types and shapes as an argument to defun, Python's own type annotations were parsed. So instead of

@tf.defun(input_signature=((tf.float32, [None, 5]),))
def f(x):
  pass

we could write

@tf.defun
def f(x: tf.TensorSpec[tf.float32, (None, 5)]):
  pass

This could not be used in TensorFlow itself (unless there were a build step that, when targeting Python 2, converted the latter into the former), but it would give projects not concerned with backward compatibility a nicer syntax, and give linters the ability to detect some API misuses before the program is run.
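
A hypothetical sketch of how such a decorator could derive the input signature from annotations; the typed_defun helper below is invented for illustration (shown with the tf.TensorSpec and tf.function spellings that eventually shipped) and is not part of the RFC:

import inspect
import tensorflow as tf

def typed_defun(fn):
  # Hypothetical: collect TensorSpec annotations in declaration order and
  # use them as the traced function's input signature.
  params = inspect.signature(fn).parameters.values()
  specs = tuple(p.annotation for p in params
                if isinstance(p.annotation, tf.TensorSpec))
  return tf.function(fn, input_signature=specs)

@typed_defun
def f(x: tf.TensorSpec(shape=[None, 5], dtype=tf.float32)):
  return tf.reduce_sum(x, axis=1)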

carlthome commented:

Definitely prefer tf.function over tf.defun! Why would we de-fun TensorFlow programs? 🙃

omoindrot commented:

It also makes sense to have @tf.function if we have @tf.method (and not @tf.defmethod).

seanpmorgan (Member) commented:

Strongly agree with @tf.function or @tf.func. On first look, tf.contrib.defun read to me like some kind of defunct function. Since the decorator will be so commonly used, it's worth making sure the name is intuitive at first glance and doesn't need to be looked up.

Looking at common Python decorators, I find the clearest ones use either:

  • an understood noun (@classmethod, @property, @app_endpoint())
  • a written-out action that wraps it (@autograph.convert(), @login_required, @atexit.register)

seanpmorgan (Member) commented:

Separately, will we need to handle tf.defun wrapping a function that uses tf.py_func? It seems like, if nothing else, it may need to throw an error, since it won't be serialized in the graph.

Related to tensorflow/tensorflow#10282
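
For reference, a small sketch of the pattern being asked about, written with the names that eventually shipped (tf.function and tf.py_function); whether tracing this should work silently, warn, or raise on export is exactly the open question:

import tensorflow as tf

def add_one_in_python(x):
  # Arbitrary Python: runs outside the graph and cannot be serialized into
  # a GraphDef, which is why exporting such a function is problematic.
  return x.numpy() + 1.0

@tf.function
def f(x):
  return tf.py_function(add_one_in_python, inp=[x], Tout=tf.float32)

print(f(tf.constant(1.0)))  # works in-process; the Python body is not in the graph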

ngc92 commented Sep 21, 2018

How does this interact with device selection? I.e.

@tf.defun
def f(...):
    ...

with tf.device("/gpu:0"):
  f()
with tf.device("/gpu:1"):
  f()

The different calls to f have the same signature here, but to execute them on different devices we need two different graphs.

samjabrahams commented:

At a high level, I really like the idea of pushing towards a more functional/composable API. For me, having to figure out a clean way to reuse components of a graph while passing in different inputs (e.g., dataset inputs during training, then exporting a graph with placeholders for inference) has always been an annoyance. I will likely have more comments (I'm currently traveling and have limited time to type things up), but there are a couple of concerns I want to bring up:

  • Currently, the Session API takes a few runs to "warm up". That is, the first couple of runs will run slower. I suppose this is more of a question about Eager in general, but are there risks of things always running slowly by not using a session? Or will there actually be less overhead due to not needing to traverse a graph to find what operations to run?
  • I'm concerned about this change being a half-measure. I haven't done user research myself, but I think that the difficulty of new-user on-boarding is more than just the fact that the graph is run lazily. It's also that the API is massive and thus prone to make new users' eyes glaze over. Not only that, but there is limited guidance on what idiomatic TensorFlow code should look like. This leads to confusion and a lot of people having to learn the same lessons over and over. I believe that one of the largest opportunities of 2.0 is the ability to remove cruft and imagine what TensorFlow could be if it were built from the ground up. I understand that there are needs in terms of enabling backwards compatibility to prevent alienating the userbase, but right now I think this risks adding bloat in the form of "there's yet another way to run a graph!".
    • I'm not saying that this API is exactly what we want; I need more time to digest it. What I would say is that either the scope should be much larger, or it should be reduced so that it doesn't require as many awkward compromises/workarounds in order to kind-of work with existing semantics.
  • I think there should be more weight given to, and conversation about, diverging the Python vs. non-Python APIs. I know that most of the people using the C++/Go APIs may be considered "power users", and thus better equipped to deal with additional things to learn, but one of the strengths of TensorFlow is that the semantics of your Python/training code are similar to those in deployment. It's hard to argue against "Python code should be Pythonic", but I wonder if there is a way to keep the APIs unified at the same time.

asimshankar (Contributor, Author) commented:

@samjabrahams, thanks for the comments. Some responses:

  • "warm up": There should be no risk of running slowly by not using a session. Any warmup (e.g., creating the kernels etc.) may happen on the first time the function is executed, but there should be no overhead beyond that. As you guessed, there may even be less overhead (the ~10us or so sometimes spent in looking up the exact graph to execute based on feeds and fetches).

  • This change is just one of many towards 2.0. You'll see talk about reducing the API surface in other RFCs, like "RFC: Unify RNN interface" (#15), "RFC: TensorFlow API symbols and namespaces" (#16), "RFC: Sunset tf.contrib" (#18), and others coming. Is there something specific to this proposal that you think can be improved?

  • Point taken about the other language APIs. I'll follow up on that.
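
A rough way to observe the warm-up behavior from the first bullet, again assuming the tf.function spelling; the first call pays for tracing and kernel creation, later calls should not:

import time
import tensorflow as tf

@tf.function
def matmul(a, b):
  return tf.matmul(a, b)

a = tf.random.normal([256, 256])
b = tf.random.normal([256, 256])

for i in range(3):
  start = time.time()
  matmul(a, b)
  print("call %d: %.4fs" % (i, time.time() - start))  # call 0 includes the one-time cost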

asimshankar (Contributor, Author) commented:

@ngc92, re #20 (comment): yes, different graphs will be created for those. This happens with tf.contrib.eager.defun today, as the context of execution is implicitly a part of the input signature. I'll clarify that in the document too.

tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 18, 2018
(In the spirit of tensorflow/community#20)

PiperOrigin-RevId: 217627136
goldiegadde self-assigned this on Oct 30, 2018
goldiegadde (Contributor) commented:

@asimshankar, could you please post the notes from the design review meeting into this thread?

Also, are there any updates to this document?

Once both of those are resolved we can proceed to merge this.

asimshankar (Contributor, Author) commented:

@goldiegadde: I updated the document in place based on the meeting notes.

goldiegadde (Contributor) commented:

> @goldiegadde: I updated the document in place based on the meeting notes.

Thanks @asimshankar. Can you push a version of the doc with the Status as "Accepted"? We can then proceed to merge it.

asimshankar (Contributor, Author) commented:

@goldiegadde: Done. Thanks.

goldiegadde (Contributor) commented:

Thanks everyone for reviewing the RFC. I am going to merge this request now.

Labels: RFC: Accepted (RFC Design Document: Accepted by Review), 2.0 (TensorFlow 2.0 development)