[Hackathon 7th] Fix s2t example errors #3950

Merged: 2 commits, Dec 18, 2024

Conversation

megemini
Contributor

PR types

Bug fixes

PR changes

Others

Describe

Fix s2t example errors:

  • In paddlespeech/s2t/io/dataloader.py, when the train branch is taken, the config ends up missing many configuration items.
  • In paddlespeech/s2t/models/u2_st/u2_st.py, the model uses TransformerDecoder, whose forward returns 3 values, so *_ is used to discard everything after the first one. The reason for not writing decoder_out, _, _ = self.decode... is that TransformerDecoder's forward may previously have returned only 2 values (its earlier typing hint listed two return values; it is updated to three here), so *_ is used for compatibility (see the sketch after this list).
  • The input to paddlespeech/s2t/frontend/featurizer/text_featurizer.py may be a nested list, so a check for that case is added as well (also sketched below).
  • Fixed other issues found during testing.
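
A minimal sketch of the second and third fixes above; the helper names (old_decoder_forward, new_decoder_forward, flatten_tokens) and the toy data are illustrative only, not the actual PaddleSpeech code:

```python
# Sketch only: why `decoder_out, *_ = ...` is used instead of `decoder_out, _, _ = ...`.
# The decoder's forward used to be typed with 2 return values and now returns 3,
# so star-unpacking works with either shape.

def old_decoder_forward(x):
    return x, "olens"                       # two return values (old typing)

def new_decoder_forward(x):
    return x, "r_decoder_out", "olens"      # three return values (current)

for forward in (old_decoder_forward, new_decoder_forward):
    decoder_out, *_ = forward("decoder_out")   # works for both shapes
    assert decoder_out == "decoder_out"

# Sketch of the nested-list handling for the text featurizer: flatten
# sub-lists of tokens before converting them to ids.
def flatten_tokens(tokens):
    flat = []
    for item in tokens:
        if isinstance(item, list):
            flat.extend(flatten_tokens(item))
        else:
            flat.append(item)
    return flat

print(flatten_tokens(["hello", ["nested", "tokens"], "world"]))
# -> ['hello', 'nested', 'tokens', 'world']
```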

Testing looks fine so far. Logs:

aistudio@jupyter-942478-8657745:~/PaddleSpeech/examples/ted_en_zh/st0$ bash run.sh --stage 0 --stop_stage 0
checkpoint name transformer_mtl_noam
Creating manifest data/manifest ...
train Processed: 1000
train Processed: 2000
train Processed: 3000
train Processed: 4000
manifest prepare done!
Complete raw data pre-process.
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
----------- compute_mean_std.py Configuration Arguments -----------
delta_delta: 0
feat_dim: 80
manifest_path: data/manifest.train.raw
num_samples: -1
num_workers: 24
output_path: data/mean_std.json
sample_rate: 16000
spectrum_type: fbank
stride_ms: 10
target_dB: -20
use_dB_normalization: 0
window_ms: 25
-----------------------------------------------------------
2024-12-12 21:34:00.167 | INFO     | paddlespeech.s2t.frontend.augmentor.augmentation:__init__:122 - Augmentation: []
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
----------- build_vocab.py Configuration Arguments -----------
count_threshold: 0
manifest_paths: ['data/manifest.train.raw']
spm_character_coverage: 1.0
spm_mode: unigram
spm_model_prefix: data/lang_char/bpe_unigram_8000
spm_vocab_size: 8000
text_keys: ['text']
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
sentencepiece_trainer.cc(78) LOG(INFO) Starts training with : 
trainer_spec {
  input: /tmp/tmpliv9c922
  input_format: 
  model_prefix: data/lang_char/bpe_unigram_8000
  model_type: UNIGRAM
  vocab_size: 8000
  self_test_sample_size: 0
  character_coverage: 1
  input_sentence_size: 100000000
  shuffle_input_sentence: 1
  seed_sentencepiece_size: 1000000
  shrinking_factor: 0.75
  max_sentence_length: 4192
  num_threads: 16
  num_sub_iterations: 2
  max_sentencepiece_length: 16
  split_by_unicode_script: 1
  split_by_number: 1
  split_by_whitespace: 1
  split_digits: 0
  pretokenization_delimiter: 
  treat_whitespace_as_suffix: 0
  allow_whitespace_only_pieces: 0
  required_chars: 
  byte_fallback: 0
  vocabulary_output_piece_score: 1
  train_extremely_large_corpus: 0
  seed_sentencepieces_file: 
  hard_vocab_limit: 1
  use_all_vocab: 0
  unk_id: 0
  bos_id: 1
  eos_id: 2
  pad_id: -1
  unk_piece: <unk>
  bos_piece: <s>
  eos_piece: </s>
  pad_piece: <pad>
  unk_surface:  ⁇ 
  enable_differential_privacy: 0
  differential_privacy_noise_level: 0
  differential_privacy_clipping_threshold: 0
}
normalizer_spec {
  name: nmt_nfkc
  add_dummy_prefix: 1
  remove_extra_whitespaces: 1
  escape_whitespaces: 1
  normalization_rule_tsv: 
}
denormalizer_spec {}
trainer_interface.cc(353) LOG(INFO) SentenceIterator is not specified. Using MultiFileSentenceIterator.
trainer_interface.cc(185) LOG(INFO) Loading corpus: /tmp/tmpliv9c922
trainer_interface.cc(409) LOG(INFO) Loaded all 9996 sentences
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: <unk>
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: <s>
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: </s>
trainer_interface.cc(430) LOG(INFO) Normalizing sentences...
trainer_interface.cc(539) LOG(INFO) all chars count=722011
trainer_interface.cc(560) LOG(INFO) Alphabet size=2614
trainer_interface.cc(561) LOG(INFO) Final character coverage=1
trainer_interface.cc(592) LOG(INFO) Done! preprocessed 9996 sentences.
unigram_model_trainer.cc(265) LOG(INFO) Making suffix array...
unigram_model_trainer.cc(269) LOG(INFO) Extracting frequent sub strings... node_num=330954
unigram_model_trainer.cc(312) LOG(INFO) Initialized 28324 seed sentencepieces
trainer_interface.cc(598) LOG(INFO) Tokenizing input sentences with whitespace: 9996
trainer_interface.cc(609) LOG(INFO) Done! 18607
unigram_model_trainer.cc(602) LOG(INFO) Using 18607 sentences for EM training
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=13035 obj=11.1349 num_tokens=35546 num_tokens/piece=2.72697
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=11469 obj=9.40258 num_tokens=35695 num_tokens/piece=3.1123
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=8799 obj=9.68791 num_tokens=39376 num_tokens/piece=4.47505
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=8797 obj=9.63443 num_tokens=39452 num_tokens/piece=4.48471
trainer_interface.cc(687) LOG(INFO) Saving model: data/lang_char/bpe_unigram_8000.model
trainer_interface.cc(699) LOG(INFO) Saving vocabs: data/lang_char/bpe_unigram_8000.vocab
2024-12-12 21:35:45.976 | WARNING  | paddlespeech.s2t.frontend.featurizer.text_featurizer:__init__:58 - TextFeaturizer: not have vocab file or vocab list. Only Tokenizer can use, can not convert to token idx
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.train.raw']
output_path: data/manifest.train
spm_model_prefix: data/lang_char/bpe_unigram_8000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
Feature dim: 80
Vocab size: 7953
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.test.raw']
output_path: data/manifest.test
spm_model_prefix: data/lang_char/bpe_unigram_8000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
Feature dim: 80
Vocab size: 7953
['data/manifest.test.raw'] Examples number: 0
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.dev.raw']
output_path: data/manifest.dev
spm_model_prefix: data/lang_char/bpe_unigram_8000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
Feature dim: 80
Vocab size: 7953
['data/manifest.dev.raw'] Examples number: 0
['data/manifest.train.raw'] Examples number: 4998
Ted En-Zh Data preparation done.


aistudio@jupyter-942478-8657745:~/PaddleSpeech/examples/ted_en_zh/st0$ CUDA_VISIBLE_DEVICES=0 ./local/train.sh conf/transformer_mtl_noam.yaml transformer_mtl_noam
...
2024-12-12 22:11:13.705 | INFO     | paddlespeech.s2t.exps.u2_st.model:valid:163 - Valid: Rank: 0, epoch: 1, step: 590, batch: 300/313, val_loss: 175.348162, val_att_loss: 151.819175, val_ctc_loss: 417.401444, val_history_st_loss: 175.311639
2024-12-12 22:11:15.551 | INFO     | paddlespeech.s2t.exps.u2_st.model:valid:165 - Rank 0 Val info st_val_loss 170.31878152974346
2024-12-12 22:11:15.553 | INFO     | paddlespeech.s2t.training.timer:__exit__:44 - Eval Time Cost: 0:02:21.742158
2024-12-12 22:11:15.553 | INFO     | paddlespeech.s2t.exps.u2_st.model:do_train:234 - Epoch 1 Val info val_loss 170.31878152974346
2024-12-12 22:11:16.311 | INFO     | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer_mtl_noam/checkpoints/1.pdparams
2024-12-12 22:11:17.600 | INFO     | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer_mtl_noam/checkpoints/1.pdopt
2024-12-12 22:11:19.459 | INFO     | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer_mtl_noam/checkpoints/1.pdparams
2024-12-12 22:11:22.560 | INFO     | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer_mtl_noam/checkpoints/1.pdopt
2024-12-12 22:11:22.563 | INFO     | paddlespeech.s2t.training.timer:__exit__:44 - Training Done: 0:10:24.706460
LAUNCH INFO 2024-12-12 22:11:25,656 Pod completed
LAUNCH INFO 2024-12-12 22:11:25,656 Exit code 0


aistudio@jupyter-942478-8657745:~/PaddleSpeech/examples/ted_en_zh/st0$ avg.sh best exp/transformer_mtl_noam/checkpoints 2
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Namespace(dst_model='exp/transformer_mtl_noam/checkpoints/avg_2.pdparams', ckpt_dir='exp/transformer_mtl_noam/checkpoints', val_best=True, num=2, min_epoch=0, max_epoch=65536)
selected val scores = [170.31878153 191.35575619]
selected epochs = [1 0]
averaged val score = 180.8372688606325
['exp/transformer_mtl_noam/checkpoints/1.pdparams', 'exp/transformer_mtl_noam/checkpoints/0.pdparams']
Processing exp/transformer_mtl_noam/checkpoints/1.pdparams
Processing exp/transformer_mtl_noam/checkpoints/0.pdparams
Saving to exp/transformer_mtl_noam/checkpoints/avg_2.pdparams


aistudio@jupyter-942478-8657745:~/PaddleSpeech/examples/ted_en_zh/st0$ CUDA_VISIBLE_DEVICES=0 ./local/test.sh conf/transformer_mtl_noam.yaml conf/tuning/decode.yaml exp/transformer_mtl_noam/checkpoints/avg_2
...
2024-12-12 22:38:38.203 | INFO     | paddlespeech.s2t.exps.u2_st.model:compute_translation_metrics:400 - Hyp: 
2024-12-12 22:38:38.204 | INFO     | paddlespeech.s2t.exps.u2_st.model:compute_translation_metrics:401 - One example BLEU = 0.0/0.0/0.0/0.0
2024-12-12 22:38:38.207 | INFO     | paddlespeech.s2t.exps.u2_st.model:test:441 - RTF: 0.000048, instance (78), batch BELU   = 0.000000
2024-12-12 22:38:39.578 | INFO     | paddlespeech.s2t.exps.u2_st.model:compute_translation_metrics:398 - Utt: 127247_0517890-0539637
2024-12-12 22:38:39.579 | INFO     | paddlespeech.s2t.exps.u2_st.model:compute_translation_metrics:399 - Ref: 科学 只能 暂时 改变 我们 自动 生成 的 假设 但是 我们 知道 如果 让 你 拿出 一张 照片 , 上面 是 一个 你 知道 的 、 可恶 的 白人 然后 你 把 这张 照片 贴 到 一个 有色人种 旁边 贴 到 一位 出色 的 黑人 旁边 有时候 这样 做 , 也 可以 帮助 我们 解除 脑内 自动 生成 的 联系
2024-12-12 22:38:39.579 | INFO     | paddlespeech.s2t.exps.u2_st.model:compute_translation_metrics:400 - Hyp: 
2024-12-12 22:38:39.580 | INFO     | paddlespeech.s2t.exps.u2_st.model:compute_translation_metrics:401 - One example BLEU = 0.0/0.0/0.0/0.0
2024-12-12 22:38:39.583 | INFO     | paddlespeech.s2t.exps.u2_st.model:test:441 - RTF: 0.000048, instance (79), batch BELU   = 0.000000
^C2024-12-12 22:38:40.270 | INFO     | paddlespeech.s2t.training.timer:__exit__:44 - Test/Decode Done: 0:01:44.328190

@zxcd @Liyulingyue @GreatV @enkilee @yinfan98


paddle-bot bot commented Dec 12, 2024

Thanks for your contribution!

@@ -404,6 +404,12 @@ def get_dataloader(mode: str, config, args):
config['subsampling_factor'] = 1
config['num_encs'] = 1
config['shortest_first'] = False
config['minibatches'] = 0
Collaborator

load the params from config?

Contributor Author

Could you elaborate?
The config here is already cloned, and these keys simply aren't in it to begin with, so where would they be loaded from? It would be best to have default values ~

Collaborator

OK.
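
As a side note on the "default values" point discussed above, a minimal sketch of filling defaults without overwriting keys that already came from the YAML config; this assumes config behaves like a plain dict, and the setdefault approach is only an illustration, not what the PR does:

```python
# Illustration only: provide defaults for keys that may be missing from the
# cloned config, without touching values that were loaded from YAML.
DEFAULTS = {
    "subsampling_factor": 1,
    "num_encs": 1,
    "shortest_first": False,
    "minibatches": 0,
}

def apply_defaults(config: dict) -> dict:
    for key, value in DEFAULTS.items():
        config.setdefault(key, value)   # keep existing values, fill gaps
    return config

config = {"num_encs": 2}                # pretend this came from the YAML file
apply_defaults(config)
print(config)                           # num_encs stays 2, the rest are filled in
```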

ys_in_lens: paddle.Tensor,
r_ys_in_pad: paddle.Tensor=paddle.empty([0]),
reverse_weight: float=0.0) -> Tuple[paddle.Tensor, paddle.Tensor]:
def forward(self,
Collaborator

only code style changed?

Contributor Author

typing hint for the output changed from

-> Tuple[paddle.Tensor, paddle.Tensor]:

to

-> Tuple[paddle.Tensor, paddle.Tensor, paddle.Tensor]:

Collaborator

Got it.
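
For reference, a sketch of the updated annotation discussed in this thread; the parameters before ys_in_lens are assumed here, and only the return annotation is the point:

```python
from typing import Tuple

import paddle

class DecoderSketch:
    # Before: ... -> Tuple[paddle.Tensor, paddle.Tensor]
    # After:  ... -> Tuple[paddle.Tensor, paddle.Tensor, paddle.Tensor]
    def forward(self,
                memory: paddle.Tensor,
                memory_mask: paddle.Tensor,
                ys_in_pad: paddle.Tensor,
                ys_in_lens: paddle.Tensor,
                r_ys_in_pad: paddle.Tensor=paddle.empty([0]),
                reverse_weight: float=0.0
                ) -> Tuple[paddle.Tensor, paddle.Tensor, paddle.Tensor]:
        # In the real model this returns three tensors; callers can unpack
        # with `decoder_out, *_ = ...` to stay compatible with both shapes.
        raise NotImplementedError
```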

@megemini requested a review from zxcd on December 16, 2024, 13:46
Collaborator

@zxcd left a comment

LGTM

@zxcd merged commit b4c2f3b into PaddlePaddle:develop on Dec 18, 2024
5 checks passed