
Add PrecisionRecall Op #5111

Merged
merged 5 commits into PaddlePaddle:develop on Nov 2, 2017
Conversation

pkuyym
Contributor

@pkuyym pkuyym commented Oct 26, 2017

Fixes #5070

This operator can be used to compute various metrics including:

  • macro average precision
  • macro average recall
  • macro f1 score
  • micro average precision
  • micro average recall
  • micro f1 score

To compute the above metrics, we need to count true positives, false positives and false negatives. The count of true negatives is not strictly necessary, but computing it is cheap and may be useful, so the operator provides it as well.

We define the state as a 2-D tensor with shape [class number, 4]. Each row holds the statistics for the corresponding class, laid out as TP (true positives), FP (false positives), TN (true negatives), FN (false negatives). If 'Input(Weights)' is provided, TP, FP, TN and FN are accumulated from the given weights instead of from instance counts.

This operator also supports computing metrics across batches. To enable this, 'Input(StatesInfo)' should be provided. The state of the current batch is accumulated onto 'Input(StatesInfo)', and 'Output(AccumStatesInfo)' holds the accumulated state.

'Output(BatchMetrics)' contains the metrics of the current batch, while 'Output(AccumMetrics)' contains the metrics of the accumulated data.

@pkuyym pkuyym changed the title Add PrecisionRecall Op [WIP] Add PrecisionRecall Op Oct 27, 2017
Contributor

@typhoonzero typhoonzero left a comment


Anyway, I can approve this for now until we have a better evaluator network implementation.

- micro average recall
- micro f1 score

To compute the above metrics, we need to statistic counts for true positives,
Contributor


we need to statistic counts for => we need statistics to count...

Contributor Author


Done.


To compute the above metrics, we need to statistic counts for true positives,
false positives and false negatives. Here count of true negatives is not
necessary, but statisticing it may provide potential usage and the cost is
Contributor


"statistic" is a noun, not a verb. Change it to "calculating".

Contributor Author


Done.

out2->mutable_data<T>(ctx.GetPlace());
auto accum_states = EigenMatrix<T>::From(*out2);
accum_states.setZero();
T* accum_states_data = out2->data<T>();
Contributor


I think accumulating should be a more general method; like in #4828 (comment), we can use an accumulating operator to build a more configurable evaluating subnetwork.

const T* weights_data = in2 ? in2->data<T>() : nullptr;
const T* states_data = in3 ? in3->data<T>() : nullptr;
T* batch_metrics_data = out0->mutable_data<T>(ctx.GetPlace());
T* accum_metrics_data = out1->mutable_data<T>(ctx.GetPlace());
Contributor


Outputs of these metrics can just be of type float or double; the type T should not affect the output type.

Contributor Author


Done.

for (size_t i = 0; i < sample_num; ++i) {
size_t max_idx = 0;
T max_val = predictions_data[i * class_dim];
for (size_t j = 1; j < class_dim; ++j) {
Contributor


You can assume the input predictions are outputs of topk op so you don't need to find the max probability here.

Contributor


topk will output both probability and indices.

class PrecisionRecallKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto* in0 = ctx.Input<Tensor>("Predictions");
Contributor


These temp var names are not quite human-readable.

Contributor

@typhoonzero typhoonzero left a comment


LGTM!

@pkuyym pkuyym merged commit 8cdb42c into PaddlePaddle:develop Nov 2, 2017