Add python wrapper for CTC greedy decoder and edit distance evaluator #7655

wanghaoshuang · 2018-01-18T09:06:12Z

… ctc_evaluator_py

kuke

I suggest dividing this operator into two: ctc_greedy_decoder and edit_distance_evaluator. Keep in mind that there is another decoding method, beam search. Users may want to output the decoding result directly, or they may want to use beam search decoding and then evaluate the result. Neither is feasible if greedy decoding and error evaluation are wrapped in one operator.
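To illustrate why the decoder is useful on its own, here is a minimal pure-Python sketch of greedy (best-path) CTC decoding. It is not the Paddle operator: the function name, the `blank=0` default, and the list-of-lists input format are assumptions made for the example. The decoded token list it returns can be printed directly or handed to any evaluator, which is the separation being requested.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Greedy (best-path) CTC decoding for a single sequence.

    frame_scores: list of frames, each a list with one score per class.
    blank: index of the CTC blank class (assumed to be 0 here).
    """
    # Step 1: pick the highest-scoring class in every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]

    # Step 2: collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for token in best_path:
        if token != previous and token != blank:
            decoded.append(token)
        previous = token
    return decoded


if __name__ == "__main__":
    # One-hot scores whose per-frame argmax is 1, 1, 0, 1, 2, 2, 0.
    path = [1, 1, 0, 1, 2, 2, 0]
    frames = [[1.0 if c == t else 0.0 for c in range(3)] for t in path]
    print(ctc_greedy_decode(frames))
```

A beam search decoder would replace only this function; the evaluation step after it would stay unchanged.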

wanghaoshuang · 2018-01-18T11:15:59Z

@kuke Thanks for your reminder.

qingqing01 · 2018-01-22T05:11:12Z

python/paddle/v2/fluid/evaluator.py

+            dtype='float32', shape=[1], suffix='total')
+        error = layers.edit_distance(input=input, label=label)
+        error = layers.cast(x=error, dtype='float32')
+        mean_error = layers.mean(x=error)


Better to keep this consistent with the Accuracy evaluator: do not calculate the mean error of the current mini-batch here. Just accumulate the batch size and the mini-batch error.
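The reason for accumulating sums instead of per-batch means: when mini-batches differ in size, the mean of the batch means is not the mean over all sequences. A minimal sketch in plain Python, assuming each mini-batch arrives as a list of per-sequence edit distances. The class name `RunningEditDistance` and its methods are hypothetical and are not the evaluator added in this PR.

```python
class RunningEditDistance:
    """Accumulates total edit distance and sequence count across mini-batches."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Clear the accumulated state, e.g. at the start of a new pass.
        self.total_distance = 0.0
        self.seq_num = 0

    def update(self, batch_distances):
        # batch_distances: one edit distance per sequence in the mini-batch.
        self.total_distance += float(sum(batch_distances))
        self.seq_num += len(batch_distances)

    def eval(self):
        # Average edit distance over every sequence seen since the last reset.
        if self.seq_num == 0:
            raise ValueError("no sequences have been accumulated")
        return self.total_distance / self.seq_num


if __name__ == "__main__":
    evaluator = RunningEditDistance()
    evaluator.update([0, 4])        # batch mean is 2.0
    evaluator.update([1, 1, 1, 1])  # batch mean is 1.0
    # Averaging the two batch means would give 1.5; the accumulated
    # average is 8 / 6 because the second batch holds more sequences.
    print(evaluator.eval())
```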

qingqing01 · 2018-01-22T05:12:39Z

python/paddle/v2/fluid/evaluator.py

+
+class EditDistance(Evaluator):
+    """
+    Average edit distance error for multiple mini-batches.


Needs more comments: how to use it, and what the value returned by eval means.

qingqing01 · 2018-01-22T05:12:59Z

python/paddle/v2/fluid/layers/nn.py

+        tokens(list): Tokens that should be removed before calculating edit distance.
+
+    Returns:
+        Variable: sequence-to-sequence edit distance loss in shape [batch_size, 1].


Remove loss.

qingqing01 · 2018-01-22T05:17:36Z

python/paddle/v2/fluid/layers/nn.py

@@ -1863,6 +1864,140 @@ def matmul(x, y, transpose_x=False, transpose_y=False, name=None):
    return out


+def edit_distance(input, label, normalized=False, tokens=None, name=None):


tokens -> ignored_tokens ?

Thx. Fixed.

qingqing01 · 2018-01-22T05:18:00Z

python/paddle/v2/fluid/layers/nn.py

+
+        normalized(bool): Indicated whether to normalize the edit distance by the length of reference string.
+
+        tokens(list): Tokens that should be removed before calculating edit distance.


tokens(list) -> tokens(list of int)
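For readers following the docstring discussion, here is a standalone sketch of what the two options describe: dropping a list of integer tokens before computing the distance, and optionally dividing by the reference length. This is plain Python, not the Paddle layer. The function name `edit_distance_sketch` is hypothetical, and normalizing by the reference length after token removal is an assumption of this example.

```python
def edit_distance_sketch(hyp, ref, normalized=False, ignored_tokens=None):
    """Levenshtein distance between two token sequences.

    hyp, ref: sequences of int tokens (hypothesis and reference).
    normalized: if True, divide by the length of the filtered reference.
    ignored_tokens: list of int tokens removed from both sequences first.
    """
    ignored = set(ignored_tokens or [])
    hyp = [t for t in hyp if t not in ignored]
    ref = [t for t in ref if t not in ignored]

    # Dynamic programming over one row at a time.
    # prev[j] is the distance between the processed prefix of hyp and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            substitution = prev[j - 1] + (0 if h == r else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    distance = float(prev[-1])
    if normalized:
        if not ref:
            raise ValueError("cannot normalize by an empty reference")
        distance /= len(ref)
    return distance


if __name__ == "__main__":
    print(edit_distance_sketch([1, 2, 3], [1, 3]))
    print(edit_distance_sketch([0, 1, 0, 2], [1, 2], ignored_tokens=[0]))
    print(edit_distance_sketch([1, 2], [1, 2, 3, 4], normalized=True))
```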

qingqing01

LGTM. Approved. Please add the unit test in the next PR.

wanghaoshuang added 3 commits January 18, 2018 10:58

Add python wrapper for ctc_evaluator

0dd3919

Add comments

082c302

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

2ca603b

… ctc_evaluator_py

wanghaoshuang requested review from kuke, qingqing01 and pkuyym January 18, 2018 09:06

kuke reviewed Jan 18, 2018

wanghaoshuang added 2 commits January 18, 2018 20:40

divide this operator into ctc_greedy_decoder and edit_distance_error.

4673a4a

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

01d568e

… ctc_evaluator_py

wanghaoshuang changed the title ~~Add python wrapper for CTC evaluator~~ Add python wrapper for CTC greedy decoder and edit distance evaluator Jan 19, 2018

wanghaoshuang added 5 commits January 19, 2018 14:53

1. Rename 'edit_distance_error' to 'edit_distance'

5846aab

2. Add edit distance evaluator to evaluator.py

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

25dec82

… ctc_evaluator_py

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

680aec2

… ctc_evaluator_py

Add EditDistance to evaluator.py

a8f118c

Add sequence_erase option into edit distance python API

0b854bd

qingqing01 reviewed Jan 22, 2018

wanghaoshuang added 3 commits January 22, 2018 16:59

1. Add sequence_num as edit distance op's output

1bc8de3

2. Fix evaluator using 'reduce_sum' op instead of 'mean' op

1. Add more comments

8143a42

Fix white space in comments.

d9d9be1

qingqing01 approved these changes Jan 22, 2018

wanghaoshuang merged commit 44561a2 into PaddlePaddle:develop Jan 22, 2018

wanghaoshuang deleted the ctc_evaluator_py branch January 22, 2018 14:47
