Moses tokenizer/detokenizer in cli options #16

Open
George0828Zhang opened this issue Jul 12, 2021 · 0 comments

Hi, could you add a Moses detokenizer to the CLI options? More specifically, when post-processing the prediction and the reference, the " ".join() call could optionally be replaced by sacremoses's detokenizer.
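
To make the request concrete, here is a rough sketch of what I have in mind. The flag name `--moses-detok-lang` and the helpers `build_detokenizer` and `build_parser` are made up for illustration and are not part of the existing CLI; only `MosesDetokenizer` and its `detokenize` method come from sacremoses. When the flag is not given, the behaviour stays the same as the current " ".join().

```python
import argparse
from typing import Callable, List, Optional


def build_detokenizer(lang: Optional[str] = None) -> Callable[[List[str]], str]:
    """Return a function that turns a list of tokens into one string.

    Hypothetical helper: with lang=None it keeps the current behaviour
    (" ".join); otherwise it uses sacremoses's MosesDetokenizer.
    """
    if lang is None:
        return " ".join
    # Imported lazily so sacremoses is only required when the option is used.
    from sacremoses import MosesDetokenizer

    return MosesDetokenizer(lang=lang).detokenize


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing how the option could be exposed."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--moses-detok-lang",
        default=None,
        help="Language code for Moses detokenization (e.g. 'en'). "
        "If unset, tokens are joined with single spaces.",
    )
    return parser


# Default path: no flag given, so tokens are simply joined with spaces.
args = build_parser().parse_args([])
detokenize = build_detokenizer(args.moses_detok_lang)
print(detokenize(["Hello", ",", "world", "!"]))  # prints "Hello , world !"
```

The same `detokenize` callable would be applied to both the prediction and the reference, so the two are post-processed consistently before scoring.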

Thanks.

xutaima added a commit that referenced this issue Dec 7, 2022
* Remove the starting silence in target speech

* Add new max_len setting

* Write score for score only option

* Update readme for s2st

* Add postprocessors

* Check fairseq agent logic

* Make TTS a postprocessor

* Refactored

* Change w2v path to absolute path

* Remove pdb

* Add build_postprocesor function to agent class

* Refactored

* Minor fix

* Test time unit waitk agent

* Fix the logic for determining online mode

* Remove debug code

* Turn off model loading logs

* Prevent unit early stops

* Add max length

* Add import_user_module

* Remove unity agent

* Add back unity agent

* Add back online extractor

* Remove unity code

* Remove pdb

* Remove unity agent

* Fix bugs running s2t agent

* Fix a bug generating target audio

* Formatting

* Adding debug log for intermediate text

* Create instead of append when index = 0

* Fix a bug not considering EOS in tts postprocessor

* Record delay before instance is finished

* Write speech latency numbers to logs

* Typo

* Use length of indices to determine the max_len

* Fix a bug generating target audio

* Formatting

* Adding debug log for intermediate text

* Create instead of append when index = 0

* Fix a bug not considering EOS in tts postprocessor

* Record delay before instance is finished

* Write speech latency numbers to logs

* Typo

* Do not record speech audio in logs

* Fix issue of wrong end index

* Handle the case where the input of the TTS system is all punctuation.