-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changed asr models outputs to be consistent #11818
base: main
Are you sure you want to change the base?
Conversation
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py
Fixed
Show fixed
Hide fixed
ed59fd2
to
3ca6a8a
Compare
3ca6a8a
to
4232b64
Compare
beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base. Your code was analyzed with PyLint. The following annotations have been identified:
Mitigation guide:
By applying these rules, we reduce the occurance of this message in future. Thank you for improving NeMo's documentation! |
Overall looks good! Thank you.
|
@@ -31,6 +31,7 @@ | |||
from nemo.collections.asr.parts.utils import manifest_utils | |||
from nemo.collections.common.data.utils import move_data_to_device | |||
from nemo.utils import logging, logging_mode | |||
from nemo.utils.decorators import deprecated |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 13 minutes ago
To fix the problem, we need to remove the unused import statement. This will clean up the code and remove the unnecessary dependency. The best way to fix this is to delete the line that imports deprecated
from nemo.utils.decorators
. This change should be made in the file nemo/collections/asr/parts/mixins/transcription.py
.
-
Copy modified line R34
@@ -33,3 +33,3 @@ | ||
from nemo.utils import logging, logging_mode | ||
from nemo.utils.decorators import deprecated | ||
|
||
|
Signed-off-by: Ssofja <sofiakostandian@gmail.com>
Signed-off-by: Ssofja <Ssofja@users.noreply.github.com> Signed-off-by: Ssofja <sofiakostandian@gmail.com>
Signed-off-by: Ssofja <Ssofja@users.noreply.github.com>
Signed-off-by: Ssofja <sofiakostandian@gmail.com>
Signed-off-by: Ssofja <Ssofja@users.noreply.github.com>
ee896e4
to
8549980
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments, otherwise LGTM
@@ -31,6 +31,7 @@ | |||
from nemo.collections.asr.parts.utils import manifest_utils | |||
from nemo.collections.common.data.utils import move_data_to_device | |||
from nemo.utils import logging, logging_mode | |||
from nemo.utils.decorators import deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this as its not used
@@ -68,6 +69,14 @@ class TranscribeConfig: | |||
|
|||
_internal: Optional[InternalTranscribeConfig] = None | |||
|
|||
@property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write a TODO here so as to remove later
num_workers=num_workers, | ||
channel_selector=channel_selector, | ||
augmentor=augmentor, | ||
verbose=verbose, | ||
timestamps=timestamps, | ||
**config_kwargs, | ||
) | ||
transcribe_cfg.return_hypotheses = return_hypotheses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it return_all_hypotheses
here or return_hypotheses
@@ -898,7 +898,7 @@ | |||
" print(f\"[ref text]: {target_transcripts[idx]}\")\n", | |||
" else:\n", | |||
" # if no spotted words, use standard greedy predictions\n", | |||
" pred_text = ctc_model.wer.decoding.ctc_decoder_predictions_tensor(greedy_predicts)[0][0]" | |||
" pred_text = ctc_model.wer.decoding.ctc_decoder_predictions_tensor(greedy_predicts)[0]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this supposed to be ctc_model.wer.decoding.ctc_decoder_predictions_tensor(greedy_predicts)[0].text
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only glanced at it but this PR is full of breaking changes. Upto @nithinraok if he's ok with it, I don't see the reason for most of these naming changes.
nemo/collections/asr/metrics/bleu.py
Outdated
@@ -116,7 +116,7 @@ def __init__( | |||
encoder_hidden_states=predictions, | |||
encoder_input_mask=predictions_mask, | |||
decoder_input_ids=input_ids, | |||
return_hypotheses=False, | |||
return_all_hypotheses=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to make this pedantic change in the name and break older inference ? Revert please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tried to handle all cases but will revert this for now as it includes other breaking changes
@@ -56,7 +57,7 @@ class InternalTranscribeConfig: | |||
@dataclass | |||
class TranscribeConfig: | |||
batch_size: int = 4 | |||
return_hypotheses: bool = False | |||
return_all_hypotheses: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't make this pedantic change please @nithinraok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets revert this @Ssofja but keep timestep
to timestamp
change.
Signed-off-by: Ssofja <sofiakostandian@gmail.com>
bd17125
to
3987d43
Compare
Signed-off-by: Ssofja <Ssofja@users.noreply.github.com>
What does this PR do ?
This PR is making Some of ASR models outputs consistent
Collection: ASR
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information