
What-If Tool: thresholded inference problem (confusion matrix/ROC) #1463

@reinhouthooft

Description

Version info:

  • TensorBoard 1.12.0a0
  • TensorFlow 1.8.0
  • MacOS 10.13.6
  • Python 2.7

Description:
Running 2-class classification with a custom estimator results in incorrect confusion matrix/ROC curve values: when dragging the threshold slider, the "actual yes/no" percentages change (see the screenshots below). In addition, when using a vocab file to specify the labels ("False", "True"), the legend shows "False" and "undefined". The inference scores themselves appear to be returned correctly.

[Screenshot: screen shot 2018-09-26 at 3 52 06 pm]

[Screenshot: screen shot 2018-09-26 at 3 51 56 pm]

[Screenshot: screen shot 2018-09-26 at 3 56 49 pm]

I would assume that, even if my model were incorrect, the "actual" counts would be independent of the threshold setting.
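
To illustrate the behavior I would expect, here is a toy check (illustrative data only, not from my model) showing that only the predicted counts should move with the threshold:

from __future__ import print_function
import numpy as np

# Toy confusion-matrix computation: the threshold only changes which cell
# of the matrix a prediction lands in; the per-class "actual" totals are
# fixed by the labels alone.
labels = np.array([0, 0, 0, 1, 0, 1])              # ground truth in {0, 1}
scores = np.array([0.1, 0.4, 0.6, 0.8, 0.3, 0.7])  # model's P(class 1)

for threshold in (0.3, 0.5, 0.9):
    preds = (scores >= threshold).astype(int)
    tp = ((preds == 1) & (labels == 1)).sum()
    fp = ((preds == 1) & (labels == 0)).sum()
    fn = ((preds == 0) & (labels == 1)).sum()
    tn = ((preds == 0) & (labels == 0)).sum()
    # "actual yes" (tp + fn) and "actual no" (fp + tn) print the same
    # values for every threshold.
    print(threshold, 'actual yes:', tp + fn, 'actual no:', fp + tn)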

Context:
The classification API is used as follows:

export_outputs = {signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                  tf.estimator.export.ClassificationOutput(scores=softmax, classes=None)}

where softmax is a (?, 2)-shaped Tensor. This leads to the following SignatureDef:

The given SavedModel SignatureDef contains the following input(s):
  inputs['inputs'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: softmax/Reshape_1:0
Method name is: tensorflow/serving/classify
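
For context, here is a minimal sketch of how this output is wired into the EstimatorSpec (the model body producing `logits` is elided, and the helper name `predict_spec` is only for illustration):

import tensorflow as tf
from tensorflow.python.saved_model import signature_constants

def predict_spec(logits, mode):
    # `softmax` is the (?, 2)-shaped Tensor referenced above.
    softmax = tf.nn.softmax(logits)
    return tf.estimator.EstimatorSpec(
        mode=mode,
        predictions={'scores': softmax},
        export_outputs={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                tf.estimator.export.ClassificationOutput(scores=softmax,
                                                         classes=None),
        })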

The ground truth is specified as an integer label in {0, 1} (about 97% of the examples are labeled 0 and 3% are labeled 1).
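
For reference, a sketch of how a single datapoint is encoded (the feature names 'text' and 'label' are placeholders for my actual schema):

import tensorflow as tf

# Hypothetical encoding of one example; only the integer label matters here.
example = tf.train.Example(features=tf.train.Features(feature={
    'text': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'some input'])),
    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[0])),  # 0 or 1
}))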

[Screenshot: screen shot 2018-09-26 at 3 51 47 pm]

Inference result as shown in the datapoint editor:
[Screenshot: screen shot 2018-09-26 at 3 57 40 pm]
