Flashlight and Pyctcdecode decoders #8428
base: main
Conversation
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
for more information, see https://pre-commit.ci
This PR is stale because it has been open for 14 days with no activity. Remove the stale label, comment, or update the PR, or it will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
```python
if cfg.amp:
    if torch.cuda.is_available() and hasattr(torch.cuda, 'amp') and hasattr(torch.cuda.amp, 'autocast'):
        logging.info("AMP is enabled!\n")
        autocast = torch.cuda.amp.autocast
    else:
        autocast = default_autocast
else:
    autocast = default_autocast
```

Code scanning / CodeQL — Note: Unused local variable `autocast` (flagged on each of the three assignments).
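One way to address the CodeQL note is to collapse the three `autocast` assignments into a single expression. A minimal sketch, assuming `default_autocast` is a no-op context manager (here played by `contextlib.nullcontext`); the names and the torch-optional guard are assumptions, not the PR's actual code:

```python
# Sketch: single-assignment autocast selection, sidestepping the
# "unused local variable" note. `nullcontext` stands in for a no-op
# `default_autocast`; the ImportError guard is only for illustration.
from contextlib import nullcontext

try:
    import torch
    amp_ok = torch.cuda.is_available() and hasattr(torch.cuda, 'amp')
except ImportError:  # torch absent: behave as if AMP is unavailable
    amp_ok = False

autocast = torch.cuda.amp.autocast if amp_ok else nullcontext

with autocast():
    result = 1 + 1  # the forward pass would run under this context
```

Because `autocast` is assigned exactly once and then used, static analysis no longer sees dead assignments in the untaken branches.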
[🤖]: Hi @karpnv 👋, I just wanted to let you know that, you know, a CICD pipeline for this PR just finished successfully ✨ So it might be time to merge this PR or like to get some approvals 🚀 But I'm just a 🤖 so I'll leave it you what to do next. Have a great day! //cc @ko3n1g
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py
Looks like it is worth merging now.
@karpov-nick please fix `autocast`/`use_amp` in scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py
LGTM
```shell
python eval_beamsearch_ngram_ctc.py model_path=<path to the .nemo file of the model> \
    dataset_manifest=<path to the input evaluation JSON manifest file> \
    ctc_decoding.beam.word_kenlm_path=<path to the binary KenLM model> \
    ctc_decoding.beam.nemo_kenlm_path=<path to the binary KenLM model>
```
can we merge this?
@karpnv could you fix merge conflicts so this can be merged?
```python
lexicon_path = os.path.join(tmpdir.name, lexicon[0].name)
SaveRestoreConnector._unpack_nemo_file(path2file=kenlm_path, out_folder=tmpdir.name, members=members)
cfg = OmegaConf.load(config_path)
return tmpdir, cfg.encoding_level, kenlm_model_path, lexicon_path
```

Code scanning / CodeQL — Error: Potentially uninitialized local variable.
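The CodeQL error above usually means a name such as `config_path` is only assigned inside a loop or branch that may never execute. A minimal sketch of the usual fix, with hypothetical member names (not the PR's actual code): initialize up front and fail loudly if the expected archive members are missing.

```python
import os


def find_member_paths(out_folder, member_names):
    """Resolve expected archive members, failing loudly instead of
    falling through with uninitialized variables (hypothetical helper)."""
    config_path = None   # defined on every path, so no "uninitialized" warning
    lexicon_path = None
    for name in member_names:
        full = os.path.join(out_folder, name)
        if name.endswith(".yaml"):
            config_path = full
        elif name.endswith(".lexicon"):
            lexicon_path = full
    if config_path is None or lexicon_path is None:
        raise FileNotFoundError("expected a .yaml config and a .lexicon member in the archive")
    return config_path, lexicon_path
```

The explicit `None` defaults plus the final check turn a silent `NameError` into a descriptive exception.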
```python
try:
    self.tmpdir, self.kenlm_encoding_level, self.kenlm_path, lexicon_path = get_nemolm(kenlm_path)
    if not self.flashlight_cfg.lexicon_path:
        self.flashlight_cfg.lexicon_path = lexicon_path
```

Code scanning / CodeQL — Error: Potentially uninitialized local variable.
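Here the assignment sits inside a `try:`, so if the call raises, names like `lexicon_path` may be read while still uninitialized. A sketch of one way to handle this (with `fetch` standing in for `get_nemolm`; all names are hypothetical, not the PR's code): define every variable before the `try` and clean up on failure.

```python
import tempfile


def load_lm(kenlm_path, fetch):
    """Sketch: assign defaults before the try so every name is defined on
    every path. `fetch` stands in for get_nemolm and may raise."""
    tmpdir = None
    lexicon_path = ""  # defined even if fetch() raises
    try:
        tmpdir, lexicon_path = fetch(kenlm_path)
    except Exception:
        if tmpdir is not None:
            tmpdir.cleanup()  # don't leak the temp directory on failure
        raise
    return tmpdir, lexicon_path
```

Pre-assigning also makes the cleanup path safe: the `except` block can test `tmpdir is not None` without risking a `NameError` of its own.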
Preserve Flashlight and Pyctcdecode beamsearch with Ngram LM
Support Flashlight and Pyctcdecode decoding with pure KenLM and NeMo KenLM
Standardize API of CLI inference scripts
Collection: ASR
Changelog
-- Get logprobs from Hypothesis
-- Use the "pyctcdecode" strategy as the default beamsearch algorithm, denoted "beam"
-- Remove default seq2seq strategy
-- Check decoding_type and search_type combinations
-- Support empty string in nemo_kenlm_path and word_kenlm_path for beamsearch without LM (ZeroLM)
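The ZeroLM case in the last item (an empty LM path, i.e. beamsearch with no language model) amounts to plain CTC prefix beam search. A toy, self-contained sketch of that idea for intuition — not NeMo's or pyctcdecode's implementation:

```python
import math
from collections import defaultdict


def logsum(a, b):
    """log(exp(a) + exp(b)), tolerating -inf."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_beam_search_zerolm(log_probs, blank=0, beam_size=4):
    """Toy CTC prefix beam search with no language model (the ZeroLM case).
    log_probs: per-frame lists of log-probabilities over the vocabulary."""
    # prefix (tuple of labels) -> (log P(ends in blank), log P(ends in non-blank))
    beams = {(): (0.0, -math.inf)}
    for frame in log_probs:
        nxt = defaultdict(lambda: (-math.inf, -math.inf))
        for prefix, (p_b, p_nb) in beams.items():
            for c, lp in enumerate(frame):
                if c == blank:
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsum(b, logsum(p_b, p_nb) + lp), nb)
                elif prefix and prefix[-1] == c:
                    # repeated label: extending requires an intervening blank;
                    # otherwise the repeat collapses onto the same prefix
                    b, nb = nxt[prefix + (c,)]
                    nxt[prefix + (c,)] = (b, logsum(nb, p_b + lp))
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, logsum(nb, p_nb + lp))
                else:
                    b, nb = nxt[prefix + (c,)]
                    nxt[prefix + (c,)] = (b, logsum(nb, logsum(p_b, p_nb) + lp))
        # keep only the top `beam_size` prefixes by total probability
        beams = dict(sorted(nxt.items(), key=lambda kv: -logsum(*kv[1]))[:beam_size])
    return list(max(beams, key=lambda k: logsum(*beams[k])))
```

Plugging an n-gram LM into the two prefix-extension branches is exactly where the KenLM score would enter; with no LM the extension cost is zero, hence "ZeroLM".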
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Additional Information