
Simplify train.py, evaluate.py, infer.py and tune.py by adding DeepSpeech2Model class for DS2. #183

Merged (3 commits) on Aug 2, 2017

Conversation

xinghai-sun
Contributor

  1. Move the functions in model.py into layer.py.
  2. Add a DeepSpeech2Model class in model.py, which has train() and infer_batch() methods (a sketch of this interface follows the list).
  3. Simplify train.py, evaluate.py, infer.py, and tune.py.
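For context, a minimal sketch of the interface that item 2 describes. Only the method names and the _create_network signature appear in this PR's diff; the other argument names are illustrative assumptions, not the exact code merged here.

```python
# Hypothetical sketch of the DeepSpeech2Model interface described above.
# Only the method names and the _create_network signature come from this
# PR; the other argument names are illustrative assumptions.
class DeepSpeech2Model(object):
    def __init__(self, vocab_size, num_conv_layers, num_rnn_layers,
                 rnn_layer_size):
        # Assemble the DS2 network topology when the model is created.
        self._create_network(vocab_size, num_conv_layers, num_rnn_layers,
                             rnn_layer_size)

    def train(self, train_batch_reader, dev_batch_reader, feeding_dict,
              learning_rate, num_passes, output_model_dir):
        # Run the training passes and save parameters after each pass.
        pass

    def infer_batch(self, infer_data, decode_method, beam_size, vocab_list):
        # Decode a batch of utterances and return the transcripts.
        pass

    def _create_network(self, vocab_size, num_conv_layers, num_rnn_layers,
                        rnn_layer_size):
        # Build data layers and the stacked conv + RNN layers (omitted).
        pass
```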

@xinghai-sun xinghai-sun requested review from kuke and pkuyym August 1, 2017 08:31
@kuke kuke (Collaborator) left a comment

Great! Almost LGTM.

@@ -62,10 +66,10 @@
type=str,
help="Manifest path for normalizer. (default: %(default)s)")
parser.add_argument(
"--decode_manifest_path",
"--tune_manifest_path",
default='datasets/manifest.test',
Collaborator

manifest.dev would be better.

Contributor Author

Done.
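For reference, an illustrative sketch of the renamed argument after this suggestion; the help text and the manifest.dev default here are assumptions, not the exact values merged in this PR.

```python
import argparse

# Illustrative sketch of the renamed tuning-manifest argument after the
# review suggestion; the default and help text are assumptions.
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
    "--tune_manifest_path",
    default='datasets/manifest.dev',
    type=str,
    help="Manifest path for parameter tuning. (default: %(default)s)")
```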

padding=(5, 10),
act=paddle.activation.BRelu())
output_num_channels = 32
output_height = 160 // pow(2, num_stacks) + 1
Collaborator

Can we figure out a way to avoid hard-coding here?

Contributor Author

Will be fixed in a later PR (this problem does not come from the current PR).
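One possible way to remove the hard-coded height, sketched only for illustration; the input_height parameter is hypothetical and not part of this PR.

```python
# Illustrative sketch only: pass the spectrogram height in as an argument
# instead of hard-coding 160. "input_height" is a hypothetical parameter.
def conv_group(input, num_stacks, input_height=160):
    # ... stacked convolution layers would be built here (omitted) ...
    output = input
    output_num_channels = 32
    # Derive the output height from the input height rather than a constant.
    output_height = input_height // pow(2, num_stacks) + 1
    return output, output_num_channels, output_height
```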

@@ -127,100 +129,47 @@

Collaborator

Is it necessary to add output_model_dir to the parser as an argument?

Contributor Author

Done.


def conv_group(input, num_stacks):
"""
Convolution group with several stacking convolution layers.
Collaborator

It seems that "stacked" is more commonly used than "stacking". The same below.

Contributor Author

Done.

for target, result in zip(target_transcripts, result_transcripts):
wer_sum += wer(target, result)
num_ins += 1
print("WER (%d/?) = %f" % (num_ins, wer_sum / num_ins))
Contributor

What does "?" mean?

Contributor Author

It refers to the size of the validation set, which is unknown during evaluation and only becomes known at the end.
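A short sketch, for illustration, of why the denominator is unknown while evaluating: the data is streamed batch by batch, so the total count is only available after the last batch. Here batch_reader, run_inference, and wer stand in for the project's actual reader, decoder, and error-rate helper.

```python
# Sketch of why "?" is printed: evaluation streams utterances batch by
# batch, so the running average is reported before the total is known.
# "batch_reader", "run_inference", and "wer" are stand-in names.
wer_sum, num_ins = 0.0, 0
for infer_data in batch_reader():
    target_transcripts, result_transcripts = run_inference(infer_data)
    for target, result in zip(target_transcripts, result_transcripts):
        wer_sum += wer(target, result)
        num_ins += 1
        print("WER (%d/?) = %f" % (num_ins, wer_sum / num_ins))
# Only here, after the reader is exhausted, is the denominator known.
print("Final WER (%d/%d) = %f" % (num_ins, num_ins, wer_sum / num_ins))
```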

padding=(5, 10),
act=paddle.activation.BRelu())
output_num_channels = 32
output_height = 160 // pow(2, num_stacks) + 1
Contributor

Since the model is being refactored, please expose 160 as a parameter instead of hard-coding it.

Contributor Author

Will be fixed in a later PR (this problem does not come from the current PR).


import paddle.v2 as paddle

DISABLE_CUDNN_BATCH_NORM = True
Contributor

Why not make DISABLE_CUDNN_BATCH_NORM a parameter of conv_bn_layer? If it is hard-coded here, users have to modify this file when training in CPU mode.

Contributor Author

Removed since the cudnn problem has been fixed.
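For reference, a rough sketch of what the reviewer's suggestion might have looked like before the flag was removed; the batch_norm_type usage reflects the paddle.v2 API as generally documented and is not code from this PR.

```python
import paddle.v2 as paddle

# Sketch of the suggestion above (the flag was ultimately removed): expose
# the cuDNN batch-norm switch as an argument so CPU-only users do not have
# to edit a module-level constant. Not the code merged in this PR.
def conv_bn_layer(input, filter_size, num_channels_in, num_channels_out,
                  stride, padding, act, use_cudnn_batch_norm=True):
    conv = paddle.layer.img_conv(
        input=input,
        filter_size=filter_size,
        num_channels=num_channels_in,
        num_filters=num_channels_out,
        stride=stride,
        padding=padding,
        act=paddle.activation.Linear(),
        bias_attr=False)
    # batch_norm_type=None lets PaddlePaddle pick the implementation;
    # "batch_norm" forces the plain (non-cuDNN) kernel.
    return paddle.layer.batch_norm(
        input=conv,
        act=act,
        batch_norm_type=None if use_cudnn_batch_norm else "batch_norm")
```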

reader=dev_batch_reader, feeding=feeding_dict)
output_model_path = os.path.join(
output_model_dir, "params.pass-%d.tar.gz" % event.pass_id)
with gzip.open(output_model_path, 'w') as f:
Contributor

Maybe we should make sure output_model_dir exists; the saving operation may fail if the directory does not exist.

Contributor Author

Done.
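A minimal sketch of the guard discussed above, reusing the names from the snippet; the parameters.to_tar call follows common paddle.v2 usage and may differ from the exact saving call in this PR.

```python
import os
import gzip

# Sketch: create output_model_dir before saving so gzip.open does not fail
# on a missing directory. "parameters" is the paddle.v2 parameters object
# assumed to be in scope; "event" comes from the training event handler.
if not os.path.exists(output_model_dir):
    os.makedirs(output_model_dir)
output_model_path = os.path.join(
    output_model_dir, "params.pass-%d.tar.gz" % event.pass_id)
with gzip.open(output_model_path, 'w') as f:
    parameters.to_tar(f)
```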

# of input batch data will be induced during training.
audio_data = paddle.layer.data(
name="audio_spectrogram",
type=paddle.data_type.dense_array(161 * 161))
Contributor

Just curious about 161 * 161: is this setting appropriate for other types of features, like MFCC?

Contributor Author

Will fix this in another PR. Let's not bring any unrelated changes into this PR.

def _create_network(self, vocab_size, num_conv_layers, num_rnn_layers,
rnn_layer_size):
# paddle.data_type.dense_array is used for variable batch input.
# The size 161 * 161 is only an placeholder value and the real shape
Collaborator

The logged output dimensions of the conv layers are derived from here, which is misleading if this placeholder is inconsistent with the real shape. Maybe a better choice is to use the real dimension of the feature vector.

[screenshot: training log showing the conv layers' output dimensions]

Contributor Author

Will fix this in another PR.
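For illustration, one way to follow the suggestion: size the data layer from the actual feature shape instead of the 161 * 161 placeholder. The function name and its parameters are hypothetical, not from this PR.

```python
import paddle.v2 as paddle

# Hypothetical sketch: derive the data layer size from the real spectrogram
# shape so the logged conv output dimensions match reality. The names
# "num_freq_bins" and "num_time_steps" are illustrative, not from this PR.
def create_audio_data_layer(num_freq_bins, num_time_steps):
    return paddle.layer.data(
        name="audio_spectrogram",
        type=paddle.data_type.dense_array(num_freq_bins * num_time_steps))
```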

@pkuyym pkuyym (Contributor) left a comment

LGTM

@xinghai-sun xinghai-sun merged commit 9807068 into PaddlePaddle:develop Aug 2, 2017
@xinghai-sun xinghai-sun deleted the refine_decoder2 branch August 2, 2017 17:58