[Feature Request] Make streaming metrics resettable #4814
Comments
@untom Thanks for filing the issue! The request sounds reasonable to me. We welcome contributions! Maybe @nathansilberman @langmore might know whether there's some existing way to reset streaming metrics; I couldn't come up with anything either. |
Streaming metrics add two local variables (for `streaming_mean`, a running total and count), so you can reset a metric by re-initializing just those variables:

    import tensorflow as tf

    value = 0.1
    with tf.name_scope('foo'):
        mean_value, update_op = tf.contrib.metrics.streaming_mean(value)
    init_op = [tf.initialize_variables(tf.local_variables())]

    stream_vars = [i for i in tf.local_variables() if i.name.split('/')[0] == 'foo']
    reset_op = [tf.initialize_variables(stream_vars)]

    with tf.Session() as sess:
        sess.run(init_op)
        for j in range(3):
            for i in range(9):
                _, total, count = sess.run([update_op] + stream_vars)
                mean_val = sess.run([mean_value])
                print(total, count, mean_val)
            sess.run(reset_op)
            print('')

|
Thanks, that is very useful. It seems to me that it would make sense if the metrics all returned a reset_op as well. |
Agree with @untom . |
Thank you @AshishBora for your guidance. However, I am encountering an issue when using TF-Slim. The problem is that TF-Slim uses tf.train.Supervisor(), and after that the graph is finalized and cannot be modified. I get the following error:
So the above solution won't work. Is there any other solution? |
You can run the following to reset the local variables used by metric computation. To avoid the issue of graph finalization, just create a reference to this op BEFORE session creation:

    reset_op = tf.local_variables_initializer()

|
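A minimal sketch of that ordering with a Supervisor (the placeholder shapes, the logdir, and the feed values here are illustrative, not taken from this thread):

```python
import tensorflow as tf

labels = tf.placeholder(tf.int64, shape=[None])
predictions = tf.placeholder(tf.int64, shape=[None])
accuracy, update_op = tf.metrics.accuracy(labels, predictions)
step = tf.train.get_or_create_global_step()

# Grab the reset op while the graph is still mutable.
reset_op = tf.local_variables_initializer()

# The Supervisor finalizes the graph; no new ops can be added after this point.
sv = tf.train.Supervisor(logdir='/tmp/metrics_demo')
with sv.managed_session() as sess:
    sess.run(update_op, {labels: [1, 0, 1], predictions: [1, 1, 1]})
    print(sess.run(accuracy))
    sess.run(reset_op)  # note: this resets ALL local variables, not just this metric's
```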
Note that this is tightly coupled to #9498. |
Thank you @nathansilberman . |
Alternatively, you can always define the metrics within a name scope and
only reset the local variables within that name scope.
…On Fri, Jun 2, 2017 at 11:01 AM, Amirsina Torfi wrote:
Thank you @nathansilberman <https://github.com/nathansilberman>. Unfortunately it didn't work. Looks like the supervisor is much smarter than that!
Moreover, the problem with tf.local_variables_initializer() is that by defining the reset_op it may reset all variables. I am not 100% sure though.
But thank you so much for the hint. I think I may find the solution from this.
|
@untom I made it, thanks to @nathansilberman . I made a pull request to tensorflow changing slim.learning.train and added a reset_op for metrics. Pull request: |
@untom I just stumbled upon your discussion. I am struggling with the same issue and I just started wrapping all my code up here. Please note that it is still a development and unstable branch and may change rapidly in the next few hours. Any feedback or suggestion is more than welcome. The idea is to have an easy way of getting both streaming and batch-based computation of metrics. I would be careful with the idea suggested by @AshishBora, since it sounds like it breaks encapsulation -- but I see that it's the quickest way to go. |
Combining the suggestions by @AshishBora (#4814 (comment)) and @nathansilberman (#4814 (comment)), I came up with this function to create all three ops with one call, while keeping the variables encapsulated (this also fixes the side-effect issue @untom mentioned (#4814 (comment))):

    def create_reset_metric(metric, scope='reset_metrics', **metric_args):
        with tf.variable_scope(scope) as scope:
            metric_op, update_op = metric(**metric_args)
            vars = tf.contrib.framework.get_variables(
                scope, collection=tf.GraphKeys.LOCAL_VARIABLES)
            reset_op = tf.variables_initializer(vars)
        return metric_op, update_op, reset_op

An example to create the operations inside the graph then looks like this:

    epoch_loss, epoch_loss_update, epoch_loss_reset = create_reset_metric(
        tf.contrib.metrics.streaming_mean_squared_error, 'epoch_loss',
        predictions=output, labels=target)

|
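A sketch of how the three returned ops might be used across epochs (the random feeds and loop counts are only stand-ins for a real input pipeline):

```python
import numpy as np
import tensorflow as tf

target = tf.placeholder(tf.float32, shape=[None])
output = tf.placeholder(tf.float32, shape=[None])
epoch_loss, epoch_loss_update, epoch_loss_reset = create_reset_metric(
    tf.contrib.metrics.streaming_mean_squared_error, 'epoch_loss',
    predictions=output, labels=target)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    for epoch in range(3):
        for _ in range(5):  # stand-in for the batches of one epoch
            sess.run(epoch_loss_update,
                     {output: np.random.rand(8).astype(np.float32),
                      target: np.random.rand(8).astype(np.float32)})
        print('epoch', epoch, 'MSE:', sess.run(epoch_loss))
        sess.run(epoch_loss_reset)  # clears only the variables in the 'epoch_loss' scope
```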
Does local_variables_initializer only return variables from the current scope? Otherwise this can have bad side effects. |
Indeed, it has! Thanks, I changed it to use |
What's the best practice to use this?

    import tensorflow as tf

    pred = tf.placeholder(shape=[3], dtype=tf.int32)
    label = tf.placeholder(shape=[3], dtype=tf.int32)
    v, op = tf.metrics.accuracy(pred, label)
    reset_op = tf.local_variables_initializer()

    sess = tf.InteractiveSession()
    sess.run(reset_op)
    for i in range(10):
        print(sess.run([v, op, reset_op], {pred: [0, 1, 2], label: [0, 1, 2]}))

I get
I add control dependencies to this:

    import tensorflow as tf

    pred = tf.placeholder(shape=[3], dtype=tf.int32)
    label = tf.placeholder(shape=[3], dtype=tf.int32)
    v, op = tf.metrics.accuracy(pred, label)
    reset_op = tf.local_variables_initializer()
    op = tf.tuple([op], control_inputs=[reset_op])[0]
    v = tf.tuple([v], control_inputs=[op])[0]

    sess = tf.InteractiveSession()
    sess.run(reset_op)
    for i in range(10):
        print(sess.run([v], {pred: [0, 1, 2], label: [0, 1, 2]}))

The results are weird too.
|
You are doing what I did wrong in the first place: you are using the local_variables_initializer, which resets all local variables, not just the ones belonging to your metric. Instead, define the metric inside a variable scope and only initialize the variables of that scope:

    with tf.variable_scope("reset_metrics_accuracy_scope") as scope:
        v, op = tf.metrics.accuracy(pred, label)
        vars = tf.contrib.framework.get_variables(scope, collection=tf.GraphKeys.LOCAL_VARIABLES)
        reset_op = tf.variables_initializer(vars)

You would then call reset_op only when you actually want to reset the metric, rather than on every run. |
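Continuing the snippet above, the call pattern could look roughly like this (a sketch; the feed values are arbitrary, and reset_op is run only between evaluation rounds, not on every step):

```python
with tf.Session() as sess:
    sess.run(reset_op)  # initialize the metric's total and count
    for _ in range(5):
        sess.run(op, {pred: [0, 1, 2], label: [0, 1, 1]})  # accumulate batches
    print(sess.run(v))   # accuracy over everything accumulated since the last reset
    sess.run(reset_op)   # start fresh for the next evaluation round
```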
@shoeffner : I'm not sure if this has changed in the latest version of TF (1.3), but 'vars' is an empty list when I use your snippet above. To be more precise, I always run into the exception in my code, which is based on your proposal:

    class MetricOp(object):
        def __init__(self, name, value, update, reset):
            self._name = name
            self._value = value
            self._update = update
            self._reset = reset

        @property
        def name(self):
            return self._name

        @property
        def value(self):
            return self._value

        @property
        def update(self):
            return self._update

        @property
        def reset(self):
            return self._reset


    def create_metric(scope: str, metric: callable, **metric_args) -> MetricOp:
        with tf.variable_scope(scope) as scope:
            metric_op, update_op = metric(**metric_args)
            scope_vars = tf.contrib.framework.get_variables(
                scope, collection=tf.GraphKeys.LOCAL_VARIABLES)
            if len(scope_vars) == 0:
                raise Exception("No local variables found.")
            reset_op = tf.variables_initializer(scope_vars)
        return MetricOp('MetricOp', metric_op, update_op, reset_op)

Sorry, I could fix that by myself. It was caused by wrapping my call to this function in a name_scope:

    with tf.name_scope('Metrics'):  # <---
        targets = tf.argmax(y, 1)
        accuracy_op = metrics.create_metric("Accuracy", tf.metrics.accuracy,
                                            labels=targets, predictions=model.prediction_op)
        precision_op = metrics.create_metric("Precision", tf.metrics.precision,
                                             labels=targets, predictions=model.prediction_op)

|
@bsautermeister
|
Hi, thank you for the leads. I am new to TF. I'd like to use the separate evaluation process in |
@studentSam0000 You might want to take a look at this tf slim example, where they already use some metrics. However, if you really want to reset your streaming accuracy, you could probably use a hook in which you call the reset op, something along these lines:

    class ResetHook(tf.train.SessionRunHook):
        """Hook to reset metrics every N steps."""

        def __init__(self, reset_op, every_step=50):
            self.reset_op = reset_op
            self.every_step = every_step
            self.reset = False

        def begin(self):
            self._global_step_tensor = tf.train.get_global_step()
            if self._global_step_tensor is None:
                raise RuntimeError("Global step should be created to use ResetHook.")

        def before_run(self, run_context):
            if self.reset:
                return tf.train.SessionRunArgs(fetches=self.reset_op)
            return tf.train.SessionRunArgs(fetches=self._global_step_tensor)

        def after_run(self, run_context, run_values):
            if self.reset:
                self.reset = False
                return
            global_step = run_values.results
            if global_step % self.every_step == 0:
                self.reset = True

Using the snippet above (#4814 (comment)) you can then build something like this:

    epoch_loss, epoch_loss_update, epoch_loss_reset = create_reset_metric(
        tf.contrib.metrics.streaming_mean_squared_error, 'epoch_loss',
        predictions=output, labels=target)

    reset_hook = ResetHook(epoch_loss_reset, 10)

    tf.contrib.slim.evaluation.evaluation_loop('local', 'checkpoints', 'logs', num_evals=1000,
                                               ..., hooks=[reset_hook])

I haven't tested the hook, but I used a similar hook to perform traces a while ago and just adjusted it a little; you should get the idea of how it works. |
I've been using the |
@gadagashwini With TF 2, this can be closed. |
@rmothukuru With TF 2, this can now be closed (the |
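For reference, a minimal sketch of the TF 2 way (not code from this thread): the tf.keras.metrics objects own their state and can be cleared with reset_states() (renamed reset_state() in newer releases):

```python
import tensorflow as tf

accuracy = tf.keras.metrics.Accuracy()

for epoch in range(3):
    # stand-ins for the batches of one epoch
    accuracy.update_state([0, 1, 2], [0, 1, 1])
    accuracy.update_state([0, 1, 2], [0, 1, 2])
    print('epoch', epoch, 'accuracy:', float(accuracy.result()))
    accuracy.reset_states()  # clear the accumulated state before the next epoch
```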
@foxik, |
Hi!
Currently, the streaming metrics are (as far as I know) not resettable. I'd like to be able to, e.g., reset the counter after each epoch. This way, a very bad accuracy at the beginning of training will not still influence the accuracy value ten epochs later. It also makes it easier to compare my results to runs obtained outside TensorFlow.
The only workaround I found is to do

    sess.run(tf.initialize_local_variables())

after each epoch, but of course this can have bad side effects if I have other local variables that I don't want to reset. Or is there a way to achieve what I want that I didn't think of?
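For illustration, a small sketch of that side effect (tf.local_variables_initializer() is the newer name for tf.initialize_local_variables(); the examples_seen counter is purely hypothetical):

```python
import tensorflow as tf

# An unrelated local variable that should NOT be reset...
examples_seen = tf.get_variable('examples_seen', shape=[], dtype=tf.int32,
                                initializer=tf.zeros_initializer(),
                                collections=[tf.GraphKeys.LOCAL_VARIABLES])
count_batch = tf.assign_add(examples_seen, 1)

# ...and a streaming metric whose state we do want to reset each epoch.
mean, update_mean = tf.contrib.metrics.streaming_mean(tf.constant(0.5))

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run([count_batch, update_mean])
    sess.run(tf.local_variables_initializer())  # resets the metric AND examples_seen
    print(sess.run(examples_seen))              # 0 again: the unwanted side effect
```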